STRUCTURE OF OPTIMAL MARTINGALE TRANSPORT PLANS IN GENERAL DIMENSIONS

NASSIF GHOUSSOUB, YOUNG-HEON KIM, AND TONGSEOK LIM 


Abstract. Given two probability measures $\mu$ and $\nu$ in “convex order” on $\mathbb{R}^d$, we study the profile
of one-step martingale plans $\pi$ on $\mathbb{R}^d \times \mathbb{R}^d$ that optimize the expected value of the modulus of their
increment among all martingales having $\mu$ and $\nu$ as marginals. While there is a great deal of results for
the real line (i.e., when $d = 1$), much less is known in the richer and more delicate higher dimensional
case that we tackle in this paper. We show that many structural results can be obtained whenever a
natural dual optimization problem is attained, provided the initial measure $\mu$ is absolutely continuous
with respect to the Lebesgue measure. One such property is that $\mu$-almost every $x$ in $\mathbb{R}^d$ is transported
by the optimal martingale plan into a probability measure $\pi_x$ concentrated on the extreme points of the
closed convex hull of its support. This will be established in full generality in the 2-dimensional case,
and also for any $d \ge 3$ as long as the marginals are in “subharmonic order”. In some cases, $\pi_x$ is
supported on the vertices of a $k(x)$-dimensional polytope, such as when the target measure is discrete.
Many of the proofs rely on a remarkable decomposition of “martingale supporting” Borel subsets of
$\mathbb{R}^d \times \mathbb{R}^d$ into a collection of mutually disjoint components by means of a “convex paving” of the source
space. If the martingale is optimal, then each of the components in the decomposition supports a
restricted optimal martingale transport for which the dual problem is attained. These decompositions
are used to obtain structural results in cases where duality is not attained. On the other hand, they can
also be related to higher dimensional Nikodym sets.
1. Introduction

We study the profile of one-step martingales $\pi$ on $\mathbb{R}^d \times \mathbb{R}^d$ that optimize the expected value of
the modulus of their increment, among all martingales with two given marginals $\mu$ and $\nu$ in convex
order. More precisely, we investigate the structure of the conditional probabilities $(\pi_x)_{x \in \mathrm{supp}\,\mu}$ on $\mathbb{R}^d$
which describe how a given particle at $x$ is propagated under such transport plans. These questions
originate in mathematical finance and are variations on the original Monge-Kantorovich problem, 
where one considers all couplings of the given marginals and not only those of martingale type 
GSL EL H3GDL However, unlike solutions of the Monge-Kantorovich problem, which are often 
supported on graphs (such as the well-known Brenier solution HD for the cost given by the squared 
distance), the additional martingale constraint forces the transport to split the elements of the initial 
measure $\mu$. One therefore cannot expect, except in trivial cases, that optimal martingale plans be
supported on graphs.

These questions are motivated by problems in mathematical finance, which call for no-arbitrage 
lower (or upper) bounds on the price of a forward starting straddle, given today’s vanilla call prices 
at the two relevant maturities. Just like in the Monge-Kantorovich theory for optimal transport, these 
problems have dual counterparts, whose financial interpretation amounts to constructing the most (or
least) expensive semi-static hedging strategy which sub-replicates the payoff of the forward starting 
straddle for any realization of the underlying forward price process. 

The minimization and maximization problems are quite different, though by now well understood
when the marginals are probability measures on the real line, at least in the case of one-step
martingales. We refer to Hobson-Neuberger [20], Hobson-Klimmek [19], and Beiglböck-Juillet [5].
For the multi-step case, see Beiglböck-Henry-Labordère-Penkner [3]. The dynamic case has also
been studied by Galichon-Henry-Labordère-Touzi [14] and Dolinsky-Soner [10], [11]. The two cases
studied are when the cost is either $c(x,y) = |x - y|$, which is the main focus of this paper, or
when the cost satisfies the so-called Mirrlees condition. Note that the one-dimensional case is
closely related to Skorokhod embedding problems [24], since real-valued martingales can be realized
as adequately stopped Brownian paths. See for example Hobson [18], Beiglböck-Cox-Huesmann [6]
and Beiglböck-Henry-Labordère-Touzi [4].

Surprisingly, much less is known in the case where the marginals are supported on higher dimensional
Euclidean spaces $\mathbb{R}^d$. In this direction, Lim [22] considered the optimal martingale transport
problem under radially symmetric marginals on $\mathbb{R}^d$, while Ghoussoub-Kim-Lim consider in [16] the
corresponding optimal Skorokhod embedding. In this paper, we shall tackle the following general
optimization problem associated to a cost function $c : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$:

$$\text{Maximize / Minimize} \quad \mathrm{cost}[\pi] = \int_{\mathbb{R}^d \times \mathbb{R}^d} c(x,y)\, d\pi(x,y) \quad \text{over } \pi \in MT(\mu,\nu). \tag{1.1}$$

Here $MT(\mu,\nu)$ is the set of martingale transport plans, that is, the set of probabilities $\pi$ on $\mathbb{R}^d \times \mathbb{R}^d$
with marginals $\mu$ and $\nu$, such that for $\mu$-almost every $x \in \mathbb{R}^d$, the component $\pi_x$ of its disintegration $(\pi_x)_x$
with respect to $\mu$, i.e. $d\pi(x,y) = d\pi_x(y)\,d\mu(x)$, has its barycenter at $x$; in other words, for any convex
function $\varphi$ on $\mathbb{R}^d$, one has $\varphi(x) \le \int \varphi(y)\, d\pi_x(y)$.

One can also use the probabilistic notation, which amounts to 

$$\text{Maximize / Minimize} \quad \mathbb{E}_P\, c(X,Y) \tag{1.2}$$

over all martingales $(X,Y)$ on a probability space $(\Omega, \mathcal{F}, P)$ into $\mathbb{R}^d \times \mathbb{R}^d$ (i.e. $\mathbb{E}[Y|X] = X$) with
laws $X \sim \mu$ and $Y \sim \nu$ (i.e., $P(X \in A) = \mu(A)$ and $P(Y \in A) = \nu(A)$ for all Borel sets $A$ in
$\mathbb{R}^d$). Note that in this case, the disintegration of $\pi$ can be written as the conditional probability
$\pi_x(A) = P(Y \in A \,|\, X = x)$.
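
To make the definition of $MT(\mu,\nu)$ concrete, here is a minimal Python sketch for finitely supported plans. It checks the barycenter condition on each conditional law $\pi_x$ and evaluates the cost for $c(x,y) = |x-y|$. The function names and the toy plan are hypothetical illustrations and do not come from the paper.

```python
from fractions import Fraction as F
from math import dist


def marginals(plan):
    """Return (mu, nu) for a finite plan given as {(x, y): mass}."""
    mu, nu = {}, {}
    for (x, y), m in plan.items():
        mu[x] = mu.get(x, 0) + m
        nu[y] = nu.get(y, 0) + m
    return mu, nu


def is_martingale(plan):
    """Check that the barycenter of each conditional law pi_x equals x."""
    mu, _ = marginals(plan)
    for x0, mass in mu.items():
        for i in range(len(x0)):
            mean_i = sum(m * y[i] for (x, y), m in plan.items() if x == x0) / mass
            if mean_i != x0[i]:
                return False
    return True


def cost(plan):
    """Expected Euclidean increment, i.e. cost[pi] for c(x, y) = |x - y|."""
    return sum(float(m) * dist(x, y) for (x, y), m in plan.items())


# Hypothetical toy plan in R^2: each source point splits evenly into two targets.
plan = {
    ((0, 0), (0, 1)): F(1, 4),
    ((0, 0), (0, -1)): F(1, 4),
    ((1, 0), (2, 0)): F(1, 4),
    ((1, 0), (0, 0)): F(1, 4),
}
mu, nu = marginals(plan)
```

Exact rational masses are used so that the barycenter test is an exact equality rather than a floating-point comparison.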

A classical theorem of Strassen [28] states that the set $MT(\mu,\nu)$ of martingale transports is
nonempty if and only if the marginals $\mu$ and $\nu$ are in convex order, that is if

(1) $\mu$ and $\nu$ are probability measures with finite first moments, and

(2) $\int_{\mathbb{R}^d} \varphi\, d\mu \le \int_{\mathbb{R}^d} \varphi\, d\nu$ for every convex function $\varphi$ on $\mathbb{R}^d$.
In that case we will write $\mu \le_c \nu$, which is sometimes called the Choquet order for convex functions.
Note that $x$ is the barycenter of a measure $\nu$ if and only if $\delta_x \le_c \nu$, where $\delta_x$ is the Dirac measure at $x$.
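
The convex order condition can be tested directly for finitely supported measures on the real line. The sketch below relies on a standard one-dimensional reduction that is not stated in the text and is an assumption here: for measures with equal mass and equal mean, it suffices to compare the "call" functions $k \mapsto \int (x-k)^+$, and for finitely supported measures it is enough to do so at the atoms. The function names are hypothetical.

```python
from fractions import Fraction as F


def call_value(measure, k):
    """Integral of (x - k)^+ against a finite measure {atom: mass} on the real line."""
    return sum(m * max(x - k, 0) for x, m in measure.items())


def mean(measure):
    """First moment of a finite measure {atom: mass}."""
    return sum(m * x for x, m in measure.items())


def in_convex_order_1d(mu, nu):
    """Test mu <=_c nu for finitely supported probability measures on R.

    Assumption (standard 1-d fact, not taken from the paper): equal mass,
    equal mean, and call-function domination at every atom are sufficient.
    """
    if sum(mu.values()) != sum(nu.values()) or mean(mu) != mean(nu):
        return False
    atoms = set(mu) | set(nu)
    return all(call_value(mu, k) <= call_value(nu, k) for k in atoms)


# delta_0 is dominated by the symmetric two-point law on {-1, 1}, but not conversely.
mu = {0: F(1)}
nu = {-1: F(1, 2), 1: F(1, 2)}
```

This mirrors the remark that $x$ is the barycenter of $\nu$ exactly when $\delta_x \le_c \nu$: here $0$ is the barycenter of the two-point law.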

We will mostly consider the Euclidean distance cost $c(x,y) = |x - y|$ unless stated otherwise,
although some of the results below hold for more general costs. We shall use the term optimization
in problem (1.1) whenever the result holds for either maximization or minimization. We shall be
more specific otherwise, since it will soon become very clear that the two cases can sometimes be
fundamentally different. The following theorem summarizes the main structural result when $\mu$ and
$\nu$ are one-dimensional marginals. Hobson-Neuberger [20] were the first to deal with the maximization
case, while Beiglböck-Juillet [5] and Hobson-Klimmek [19] deal with minimization.

Theorem 1.1. (Beiglböck-Juillet [5], Hobson-Neuberger [20], Hobson-Klimmek [19]) Assume that
$\mu$ and $\nu$ are probability measures in convex order on $\mathbb{R}$, and that $\mu$ is continuous. Then there exists a
unique optimal martingale transport plan $\pi$ for the cost function $c(x,y) = |x - y|$, such that:

(1) If $\pi$ is a minimizer, then its disintegration satisfies $|\mathrm{supp}\,\pi_x| \le 3$ for every $x \in \mathbb{R}$. More
precisely, $\pi$ can be decomposed into $\pi_{stay} + \pi_{go}$, where $\pi_{stay} = (\mathrm{Id} \times \mathrm{Id})_{\#}(\mu \wedge \nu)$ (this measure
is concentrated on the diagonal of $\mathbb{R}^2$) and $\pi_{go}$ is concentrated on $\mathrm{graph}(T_1) \cup \mathrm{graph}(T_2)$,
where $T_1, T_2$ are two real-valued functions.

(2) If $\pi$ is a maximizer, then its disintegration satisfies $|\mathrm{supp}\,\pi_x| \le 2$ for every $x \in \mathbb{R}$, and $\pi$ is
concentrated on $\mathrm{graph}(T_1) \cup \mathrm{graph}(T_2)$, where $T_1, T_2$ are two real-valued functions.

Our main goal in this paper is to consider higher dimensional analogues of the above result. In
[22], Lim showed that the above theorem extends, in the case of minimization, to the setting where
the marginals are radially symmetric on $\mathbb{R}^d$ and $c(x,y) = |x - y|^p$ for $0 < p \le 1$. The general case is
wide open and our goal is to work towards establishing the following:


Conjecture 1: Consider the cost function $c(x,y) = |x - y|$ and assume that $\mu$ is absolutely continuous
with respect to the Lebesgue measure on $\mathbb{R}^d$ ($\mu \ll \mathcal{L}^d$). If $\pi$ is a martingale transport that optimizes
(1.1), then for $\mu$-almost every $x$, $\mathrm{supp}\,\pi_x$ coincides with the set of extreme points of the convex hull
of $\mathrm{supp}\,\pi_x$, i.e., $\mathrm{supp}\,\pi_x = \mathrm{Ext}\big(\mathrm{conv}(\mathrm{supp}\,\pi_x)\big)$.

Remark 1.2. If $\mathrm{supp}\,\pi_x$ is bounded for $\mu$-almost all $x$ (which is the case in particular when the target
measure $\nu$ is compactly supported), then $\mathrm{conv}(\mathrm{supp}\,\pi_x) = \overline{\mathrm{conv}}(\mathrm{supp}\,\pi_x)$. In this case, the set of
extreme points $\mathrm{Ext}\big(\mathrm{conv}(\mathrm{supp}\,\pi_x)\big)$ is also called the Choquet boundary of the compact convex set
$\overline{\mathrm{conv}}(\mathrm{supp}\,\pi_x)$. Our conjecture can therefore be rephrased as: for $\mu$-a.e. $x$, $\mathrm{supp}\,\pi_x$ is equal to the
Choquet boundary of its closed convex hull.


Note that for the minimization problem, we can and will assume that $\mu \wedge \nu = 0$, since any
minimizing martingale transport for problem (1.1) must let the support of $\mu \wedge \nu$ stay put. See [5]
or [22] for a proof. One can then easily see that in the one-dimensional case, the above conjecture
reduces to Theorem 1.1, since then the dimension of the linear span of $\mathrm{supp}\,\pi_x$ is one and the Choquet
boundary consists of exactly two points, unless of course $\mathrm{supp}\,\pi_x$ is a singleton.

We shall be able to prove the above conjecture in many important cases, in particular, when a
natural dual optimization problem is attained (Theorem 2.4), or when the linear span of $\mathrm{supp}\,\pi_x$
has full dimension (Corollary 2.13). Another case where the answer is affirmative is in dimension
$d = 2$ (Theorem 2.14), provided the second marginal has compact support. The conjecture also holds
partially (Theorem 6.1) when the marginals are in “subharmonic order,” that is if

$$\int_{\mathbb{R}^d} \varphi\, d\mu \le \int_{\mathbb{R}^d} \varphi\, d\nu \quad \text{for every subharmonic function } \varphi \text{ on } \mathbb{R}^d.$$

We actually expect to have a more rigid structure in the case of minimization. Indeed, Lim [22]
showed that in this case, assuming $\mu \wedge \nu = 0$, we also have $|\mathrm{supp}\,\pi_x| \le 2$ for $\mu$-almost all $x$, whenever
the marginals are radially symmetric on $\mathbb{R}^d$ and $c(x,y) = |x - y|^p$ for $0 < p \le 1$. The general case
remains open, and we propose the following:

Conjecture 2 (Minimization): Consider the cost function $c(x,y) = |x - y|$ and assume that $\mu$ is
absolutely continuous with respect to the Lebesgue measure on $\mathbb{R}^d$, and that $\mu \wedge \nu = 0$. If $\pi$ is a martingale
transport that minimizes (1.1), then for $\mu$-almost every $x$, the set $\mathrm{supp}\,\pi_x$ consists of $k + 1$ points that
form the vertices of a $k$-dimensional polytope, where $k := k(x)$ is the dimension of the linear span of
$\mathrm{supp}\,\pi_x$, and therefore the minimizing solution is unique.

We shall give a partial answer to the above conjecture under the assumption that the target measure
$\nu$ is discrete. Actually, in this case the result holds true in both the maximization and minimization
cases (Theorem 2.15). We note however that, unlike the minimization case, one cannot always
expect in higher dimensions either the uniqueness of a maximizer (Example 2.17) or a polytope-type
structure for $\mathrm{supp}\,\pi_x$ (Example 2.16), even when the marginals are radially symmetric.

Just like in the Monge-Kantorovich theory, the above optimization problem (1.1) has a dual
formulation, which will be crucial to our analysis. And similarly to that theory, the dual problem
can be studied independently of the primal problem and without any underlying reference measures.
Recall that for the quadratic cost studied by Brenier, the dual problem amounts to considering convex
functions $\beta$, their Fenchel-Legendre duals $\alpha = \beta^*$ and the set $\Gamma = \{(x,y) \in \mathbb{R}^d \times \mathbb{R}^d ;\ \beta(y) + \alpha(x) = \langle x, y \rangle\}$,
which happens to be the graph of the subdifferential of $\beta$. Similar but more complicated
phenomena arise in our situation. We shall work with the following notions.

For a subset $\Gamma$ in $\mathbb{R}^d \times \mathbb{R}^d$, we shall denote by $\Gamma_x$ the fiber $\Gamma_x := \{y \in \mathbb{R}^d ;\ (x,y) \in \Gamma\}$. For a Borel
set $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$, we write $X_\Gamma := \mathrm{proj}_X \Gamma$, $Y_\Gamma := \mathrm{proj}_Y \Gamma$, i.e. $X_\Gamma$ is the projection of $\Gamma$ on the first
coordinate space $\mathbb{R}^d$, and $Y_\Gamma$ on the second.

Definition 1.3. Let $c : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ be a cost function and let $X, Y \subset \mathbb{R}^d$ be two Borel sets.

(1) We say that a triplet of measurable functions $(\alpha, \gamma, \beta)$ is an admissible triple on $X \times Y$ if
$\alpha : X \to \mathbb{R}$, $\beta : Y \to \mathbb{R}$, and $\gamma : X \to \mathbb{R}^d$ satisfy the following inequality

$$\beta(y) - \alpha(x) - \gamma(x) \cdot (y - x) \le c(x,y) \quad \text{for all } (x,y) \in X \times Y. \tag{1.3}$$

We shall denote by $E_m(c,X,Y)$ the set of all such admissible dual triples. A similar definition
holds when the inequality is reversed, and the set of those triplets will be denoted by
$E_M(c,X,Y)$. Note that $E_M(c,X,Y) = -E_m(-c,X,Y)$.

(2) For an admissible triple $(\alpha, \gamma, \beta)$, we will consider the set where equality holds, that is

$$\Gamma_{(\alpha,\gamma,\beta)} := \{(x,y) \in X \times Y \ |\ \beta(y) - \alpha(x) - \gamma(x) \cdot (y - x) = c(x,y)\}. \tag{1.4}$$

We shall sometimes allow $\gamma$ to be a set-valued function. In this case, the above inequality
and equality will mean that they actually hold for any vector $b$ in $\gamma(x)$.

(3) Any non-empty subset $\Gamma$ of $\Gamma_{(\alpha,\gamma,\beta)}$ will be called a c-contact layer for $(\alpha, \gamma, \beta)$ in $X \times Y$. When
the ambient space is not specified, it means that it is simply $X_\Gamma \times Y_\Gamma$.

We shall sometimes say that a set $\Gamma$ is c-exposed by the admissible triple $(\alpha, \gamma, \beta)$ if it is
contained in $\Gamma_{(\alpha,\gamma,\beta)}$.

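Denoting $E_m = E_m(c, \mathbb{R}^d, \mathbb{R}^d)$, one can then show (see for example 0) that if the cost $c$ is lower
semi-continuous, then for the minimization problem,

$$\min\Big\{ \int_{\mathbb{R}^d \times \mathbb{R}^d} c(x,y)\, d\pi ;\ \pi \in MT(\mu,\nu) \Big\} \tag{1.5}$$

$$= \sup\Big\{ \int_{\mathbb{R}^d} \beta\, d\nu - \int_{\mathbb{R}^d} \alpha\, d\mu ;\ (\alpha,\gamma,\beta) \in E_m \text{ for some } \gamma \in C_b(\mathbb{R}^d, \mathbb{R}^d) \Big\}. \tag{1.6}$$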
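Similarly, if the cost $c$ is upper semi-continuous, then

$$\max\Big\{ \int_{\mathbb{R}^d \times \mathbb{R}^d} c(x,y)\, d\pi ;\ \pi \in MT(\mu,\nu) \Big\} \tag{1.7}$$

$$= \inf\Big\{ \int_{\mathbb{R}^d} \beta\, d\nu - \int_{\mathbb{R}^d} \alpha\, d\mu ;\ (\alpha,\gamma,\beta) \in E_M \text{ for some } \gamma \in C_b(\mathbb{R}^d, \mathbb{R}^d) \Big\}. \tag{1.8}$$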


Note that if $\pi$ is an optimal martingale measure and if the corresponding dual problem is attained on
a triplet $(\alpha, \gamma, \beta)$, then it is easy to see that there exists a Borel subset $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$ with full $\pi$-measure
that is a c-contact layer for $(\alpha, \gamma, \beta)$, namely,

$$\beta(y) - \alpha(x) - \gamma(x) \cdot (y - x) = c(x,y) \quad \text{if and only if } (x,y) \in \Gamma. \tag{1.9}$$

We shall show that such c-contact layers have a specific extremal structure. As a result, any martingale
transport $\pi \in MT(\mu,\nu)$ which is concentrated on a c-contact layer, when $c(x,y) = \pm|x - y|$, will
satisfy Conjecture 1.

Recall that in the Monge-Kantorovich theory for mass transport, the dual problem is normally
attained, and the “corresponding c-contact layer” is a set of the form $\Gamma = \{(x,y);\ \beta(y) - \alpha(x) = c(x,y)\}$,
where $\beta$ and $\alpha$ are related through c-Legendre duality, which lets them inherit some of the regularity
properties of $c$. We shall follow a similar methodology here by defining and exploiting in Section 3
a notion of martingale c-Legendre duality between the function $\beta$ and the pair $(\alpha, \gamma)$. This will allow
us to establish the regularity properties needed to analyze the structure of c-layer sets.

However, unlike the Monge-Kantorovich setting, attainment of the dual problem does not often
hold for optimal martingale transports, at least in the maximization problem, even in the
one-dimensional case, as shown in [3]. See Example 5.7 below. We therefore explore whether dual
attainment can happen locally, which is sufficient to imply Conjecture 1. We prove in Section 6 that
it is indeed the case under suitable assumptions on the marginals, such as when they are comparable
for the order induced by subharmonic functions; see Theorem 6.1.


More importantly, we then proceed to consider the general case by establishing a remarkable
decomposition for any Borel set $\Gamma$ supporting a given optimal martingale transport $\pi$ into disjoint
components $\{\Gamma_C\}_{C \in I}$ in such a way that each piece is a c-contact layer for an admissible triplet
$(\alpha_C, \gamma_C, \beta_C)$. What is remarkable is that this decomposition into c-contact layers can be established
in full generality (i.e., for any cost function) and without any reference to a martingale transport
problem or even to any reference measure. The decomposition is done through an equivalence
relation on the projection $X_\Gamma$ of $\Gamma$ on the first coordinate, that is induced by a well chosen irreducible
convex paving, that is, a collection of mutually disjoint convex subsets in $\mathbb{R}^d$ that covers $X_\Gamma$. See
Theorem 2.11 for the precise statement.

We note that this result can be seen as a generalization of the decomposition of Beiglböck-Juillet
[5] in the one-dimensional case $d = 1$, where the disintegration comes from restricting the measures
$\mu, \nu$ onto open subintervals of $\mathbb{R}$ obtained by examining the potential functions for $\mu, \nu$. Like theirs,
our decomposition applies to any cost function $c$ and not only to $c(x,y) = |x - y|$. It is however
quite different since it depends on the support of the martingale measure $\pi$ that we start with. More
importantly, our decomposition need not be countable (Example 9.3), which creates additional and
interesting complications for the higher dimensional cases.

We shall use the above decomposition to establish the above stated conjectures under various
conditions. For example, Conjecture 1 holds in dimension $d = 2$ (Theorem 2.14), and also in the
case where all components $(C)_{C \in I}$ are $d$-dimensional (see Corollary 2.13).

Remarkably, the results discussed so far do not distinguish between the minimization and maximization
problems (except that we assume that $\mu \wedge \nu = 0$ in the case of minimization). The previously
mentioned decomposition can be used to prove Conjecture 2 in both the minimization and maximization
cases, provided the target measure $\nu$ has a countable support (Theorem 2.15). However, as
mentioned above, we believe that these two problems are quite different, at least in terms of finding
finer structural results for each of the cases.
Back to the martingale problem, we then consider the disintegration $(\pi_C)_{C \in I}$ of any martingale
measure $\pi$ along the above described decomposition of its support $\Gamma$ (Theorem 9.1). This then gives
a canonical decomposition of the optimal martingale problem into a collection of non-interactive
martingale problems where duality is attained for each piece $\pi_C$ in $MT(\mu_C, \nu_C)$.

In the next section, we give the precise statements of our results. In Section 3 we introduce and
study the notion of martingale c-transforms, which will be used to improve the regularity properties
of admissible triples. This will be used in Section 4 to analyze the structure of c-contact layers that
are exposed by such triples. We apply these results in Section 5 to the case where the dual problem
is attained, proving that Conjecture 1 holds in this situation. In Section 6, we give a setting where
the dual problem is attained locally, showing Conjecture 1 for a case where the marginals are
in subharmonic order. In Section 7 we establish the decomposition, as well as the existence of
admissible triplets exposing each of the components. Section 8 is devoted to proving, under various
additional conditions, structural results for sets where optimal martingale transports concentrate,
while Section 9 deals with the disintegration of martingales along this decomposition and how it is
related to the presence of Nikodym sets.

The authors are extremely grateful to Luigi Ambrosio for several pertinent remarks and discussions
regarding the results of this paper. The first-named author also thanks David Preiss for very
helpful discussions and insight.


2. Main results 

To discuss our main results, we first introduce a few definitions. We also borrow some of the 
notation from 0. 

Definition 2.1. For $A \subset \mathbb{R}^d$, we shall write $V(A)$ for the lowest-dimensional affine space containing
$A$. Also define $IC(A) := \mathrm{int}(\mathrm{conv}(A))$ and $CC(A) := \mathrm{cl}(\mathrm{conv}(A))$, where the interior and closure
are taken in the topology of $V(A)$, that is, the Euclidean metric topology of $V(A)$ (and not that of
the whole space $\mathbb{R}^d$).

If $A = \{x\}$, then $IC(A) = \{x\}$, since we consider the interior of a singleton set to be itself in the
topology of a 0-dimensional space.

In reality, we will be dealing with the vertical fibers $\Gamma_x = \{y \in \mathbb{R}^d \ |\ (x,y) \in \Gamma\}$ of a certain class
of Borel sets $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$, on which martingale measures $\pi \in MT(\mu,\nu)$ would be concentrated. The
constraint that $x$ is the barycenter of $\pi_x$, which is normally supported on $\Gamma_x$, naturally leads us to the
following definition. Recall that for a Borel set $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$, we write $X_\Gamma := \mathrm{proj}_X \Gamma$, $Y_\Gamma := \mathrm{proj}_Y \Gamma$,
i.e. $X_\Gamma$ is the projection of $\Gamma$ on the first coordinate space $\mathbb{R}^d$, and $Y_\Gamma$ on the second.

Definition 2.2. We say that a Borel set $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$ is a martingale supporting set, if

$$\text{for every } x \in X_\Gamma, \quad x \in IC(\Gamma_x). \tag{2.1}$$

We let $\mathcal{S}_{MT}$ denote the class of all martingale supporting sets.
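
Condition (2.1) is easy to test for a finite set of pairs on the real line, where $IC(A)$ is either the singleton $A$ itself or the open interval between the smallest and largest points of $A$. The following Python sketch covers only this one-dimensional finite case; the function names are hypothetical and not from the paper.

```python
def fiber(gamma, x):
    """Vertical fiber Gamma_x = {y : (x, y) in Gamma} of a finite set of pairs."""
    return {y for (x0, y) in gamma if x0 == x}


def in_IC_1d(x, A):
    """Membership of x in IC(A) for a finite, non-empty set A of real numbers.

    For a singleton, IC(A) = A; otherwise IC(A) is the open interval (min A, max A).
    """
    lo, hi = min(A), max(A)
    if lo == hi:
        return x == lo
    return lo < x < hi


def is_martingale_supporting_1d(gamma):
    """Check condition (2.1): every x in the projection X_Gamma lies in IC(Gamma_x)."""
    sources = {x for (x, _) in gamma}
    return all(in_IC_1d(x, fiber(gamma, x)) for x in sources)


# Mass at 0 splits to -1 and 1; mass at 2 stays put.
good = {(0, -1), (0, 1), (2, 2)}
# Here 0 sits at an endpoint of its fiber {0, 1}, so (2.1) fails.
bad = {(0, 0), (0, 1)}
```

The failing example reflects the barycenter constraint: no probability measure charging both $0$ and $1$ can have its barycenter at $0$.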

Our first main result shows that martingale supporting sets that are c-contact layers enjoy special 
structural properties. A key step established in Section 3 is to show that an exposing admissible
triple can be extended and regularized via a notion of martingale c-Legendre transform, so that it 
verifies the needed differentiability properties. 

Theorem 2.3 (Regularization of admissible triples via martingale c-Legendre transform). Let $c$
be a cost function on $\mathbb{R}^d \times \mathbb{R}^d$ such that $x \mapsto c(x,y)$, resp. $y \mapsto c(x,y)$, is locally Lipschitz, where the
Lipschitz constants are uniformly bounded in $y$ and, respectively, in $x$. Let $\Gamma$ be a Borel set in $\mathcal{S}_{MT}$
that is a c-contact layer, and suppose that $X_\Gamma \subset \Omega := IC(Y_\Gamma)$ with $\Omega$ being an open set in $\mathbb{R}^d$. Then,

(1) There exist a locally Lipschitz function $\alpha : \Omega \to \mathbb{R}$, a locally bounded $\gamma : \Omega \to \mathbb{R}^d$, and
$\beta : \mathbb{R}^d \to \mathbb{R}$, such that $\Gamma$ is a c-contact layer for the triplet $(\alpha, \gamma, \beta)$.

(2) If the admissible triple is in $E_M(c, X_\Gamma, Y_\Gamma)$, and if $y \mapsto c(x,y)$ is assumed to be convex, then
$\beta$ can be taken to be a convex function on $\mathbb{R}^d$.

(3) If $c(x,y) = |x - y|$ and the admissible triple is in $E_m(c, X_\Gamma, Y_\Gamma)$, then $\alpha = \beta$ on $\Omega$.

This will allow us to prove the following structural result. 

Theorem 2.4 (Extremal structure of a martingale supporting c-contact layer). Let $c(x,y) =
\pm|x - y|$ and assume $\Gamma$ is a c-contact layer in $\mathcal{S}_{MT}$. Then for $\mathcal{L}^d$-a.e. $x$ in $X_\Gamma$, the closure $\overline{\Gamma_x}$ of $\Gamma_x$
coincides with the set of extreme points of the convex hull of $\overline{\Gamma_x}$, i.e., $\overline{\Gamma_x} = \mathrm{Ext}\big(\mathrm{conv}(\overline{\Gamma_x})\big)$.

In particular, if $\mu$ is a probability measure that is absolutely continuous with respect to the
Lebesgue measure, and if the dual problem is attained, then for any $\pi \in MT(\mu,\nu)$ that is a solution
of (1.1) for either the minimization or maximization problem, we have, for $\mu$-a.e. $x$, $\mathrm{supp}\,\pi_x =
\mathrm{Ext}\big(\mathrm{conv}(\mathrm{supp}\,\pi_x)\big)$.

We shall see that the dual problem is not always attained. However, a localized version of the
above theorem will allow us to deal with a case where the marginals are in subharmonic order.
Actually, by letting $P_\mu$ be the Newtonian potential of a probability measure $\mu$, we shall be able to
deduce the following result (see Section 6).

Theorem 2.5 (Case of marginals in subharmonic order). Assume $\mu \le_{SH} \nu$ where $\mu, \nu$ are probability
measures with compact support on $\mathbb{R}^d$ such that $\mu \ll \mathcal{L}^d$ ($d \ge 3$), and that the open set
$\{x \ |\ P_\nu(x) - P_\mu(x) > 0\}$ has the full measure of $\mu$. If $\pi \in MT(\mu,\nu)$ is an optimal solution for the problem
(1.1) where the cost function is $c(x,y) = \pm|x - y|$, then for $\mu$-a.e. $x$, $\mathrm{supp}\,\pi_x = \mathrm{Ext}\big(\mathrm{conv}(\mathrm{supp}\,\pi_x)\big)$.

Since martingale supporting sets $\Gamma$ in $\mathcal{S}_{MT}$ are not always c-contact layers even when they are
concentration sets for optimal martingale transports ([3] or Example 5.7 below), we investigate the
possibility of decomposing such sets into “irreducible components” such that each component becomes
a c-contact layer. For that, we introduce the concept of a convex paving.

Definition 2.6. Let $\Phi$ be a family of mutually disjoint open convex sets in $\mathbb{R}^d$ (recall that here, the
openness of a set $C$ is with respect to the space $V(C)$). Given a set $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$, we shall say that $\Phi$
is a convex paving for $\Gamma$ provided

(1) $X_\Gamma \subset \bigcup_{C \in \Phi} C$.

(2) Each $C \in \Phi$ contains at least one element $x$ in $X_\Gamma$ ($C$ is then denoted $C(x)$).

(3) For any $z, x \in X_\Gamma$, we have: $IC(\Gamma_z) \cap C(x) \neq \emptyset \Rightarrow IC(\Gamma_z) \subset C(x)$.

Note that such a paving clearly defines an equivalence relation on $X_\Gamma$ by simply defining $x \sim x'$ if
and only if $C(x) = C(x')$. The corresponding equivalence classes are then $[x] = C(x) \cap X_\Gamma$.

There can be many convex pavings of a set $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$; take for example $\Phi := \{\mathbb{R}^d\}$, which however
doesn’t give much information about $\Gamma$. We therefore introduce the following concept.

Definition 2.7. For a fixed set $\Gamma \subset \mathbb{R}^d \times \mathbb{R}^d$, we shall say that $\Phi$ is an irreducible convex paving for
$\Gamma$ if for any other convex paving $\Psi$ for $\Gamma$, we have the following property: If $C \in \Phi$, $D \in \Psi$ are such
that $C \cap D \neq \emptyset$, then necessarily $C \subset D$.

Note that an irreducible convex paving for a set $\Gamma$ is necessarily unique. As to their existence, we
shall show in Section 7 the following result.

Theorem 2.8. For every martingale supporting set $\Gamma$ in $\mathcal{S}_{MT}$ there exists a unique irreducible convex
paving for $\Gamma$.

Now, a key property of optimal transport plans in Monge-Kantorovich theory is that they are
concentrated on Borel sets that are c-cyclically monotone, which is a property that describes every
finite collection of points in the concentration set [29]. Similarly, a key property of an optimal
martingale transport $\pi \in MT(\mu,\nu)$, due to Beiglböck and Juillet [5], is a monotonicity property
enjoyed by every finite collection of points in its support. It implies in particular that there exists
a set $A$ of full $\pi$-measure in $\mathbb{R}^d \times \mathbb{R}^d$ such that each one of its finite subsets is a c-contact layer. This
is one of the consequences of the variational lemma in 0, where duality on finite sets is obtained 
via linear programming (see 0 and 03). We therefore introduce the following combinatorial 
counterpart of cyclic monotonicity for martingale transport. 





Definition 2.9. A subset $A$ of $\mathbb{R}^d \times \mathbb{R}^d$ is said to be c-finitely exposable for some cost function $c$, if
each one of its finite subsets is a c-contact layer.


The following proposition describes the combinatorial nature of the support of optimal martingale
transports.

Proposition 2.10. Let $\pi \in MT(\mu,\nu)$ be an optimal martingale transport for Problem (1.1). Assuming
the cost $c$ is continuous, there exists a c-finitely exposable concentration set $A$ for $\pi$.

Indeed, it is shown in a (see also ED) that there exists a Borel set A in R rf x R rf with 7r(A) = 1, 
that satisfies a certain monotonicity property, which is the martingale counterpart of the c-cyclic 
monotonicity that is inherent to the Monge-Kantorovich theory. As mentioned above, by the duality 
theorem of linear programming, this property is equivalent to saying that every finite subset of A is 
a c-contact layer. 

Since duality is not attained in general, an optimal martingale transport measure is not necessarily
concentrated on a c-contact layer $\Gamma \in \mathcal{S}_{MT}$. On the other hand, we can and will assume that it is
concentrated on a set $\Gamma \in \mathcal{S}_{MT}$ whose finite subsets are c-contact layers. This leads to the question
of finding “maximal” components of $\Gamma$ that are c-contact layers. It turns out that this is indeed
possible, as we show that $\Gamma_C := \Gamma \cap (C \times \mathbb{R}^d)$ is a c-contact layer for any component $C$ of the irreducible
convex paving $\Phi$ of $\Gamma$. This is summarized in the following theorem.


Theorem 2.11. Let $\Gamma$ be a c-finitely exposable set in $\mathcal{S}_{MT}$. Then there exists an irreducible convex
paving $\Phi$ for $\Gamma$ such that for every convex component $C$ in $\Phi$, the set $\Gamma \cap (C \times \mathbb{R}^d)$ is a c-contact layer.


Remark 2.12. Theorem 2.11 can be seen as a martingale counterpart to a celebrated result of Rockafellar
[25] in the Monge-Kantorovich theory for mass transport, which essentially says that the
c-cyclically monotone sets that characterize the supports of optimal transport plans are
in some sense “c-contact layers” exposed by a pair of functions, one being c-convex and the other being
its c-Legendre transform. Here, c-finite exposability replaces c-cyclic monotonicity, while “exposing”
martingale supporting sets requires a new notion of duality between a function $\beta$ and a pair of
functions $(\alpha, \gamma)$. However, in the martingale case, the whole support is not necessarily a c-contact
layer, but every irreducible component is.


Theorems 2.3 and 2.11 yield several structural results in general dimensions such as the following.
Note that the attainability of the dual problem is not assumed here.


Corollary 2.13 (dimensional result). Let $\pi$ be a solution of the optimization problem (1.1) with
$c(x,y) = |x - y|$ and suppose $\mu$ is absolutely continuous with respect to the Lebesgue measure. Then
for $\mu$-almost every $x$ in $\mathbb{R}^d$,

(1) the Hausdorff dimension of $\mathrm{supp}\,\pi_x$ is at most $d - 1$, and

(2) if $\dim V(\mathrm{supp}\,\pi_x) = d$, then $\mathrm{supp}\,\pi_x = \mathrm{Ext}\big(\mathrm{conv}(\mathrm{supp}\,\pi_x)\big)$.


Proof. Indeed, there exists $\Gamma \in \mathcal{S}_{MT}$ with $\pi(\Gamma) = 1$ that is c-finitely exposable, and such that $\Gamma_x =
\mathrm{supp}\,\pi_x$ for $\mu$-a.e. $x$ (see Appendix A). Now, consider those points $x$ with $\dim V(\Gamma_x) = d$. In this
case, the disjoint sets $C(x)$ in Theorem 2.11 are open sets in $\mathbb{R}^d$ and so the restriction of $\mu$ to each
of the components is again absolutely continuous. Theorems 2.3 and 2.4 can then be applied. Note
now that the set of extreme points has dimension at most $d - 1$. This shows that for $\mu$-a.e. $x$ in
the open set $\bigcup_{\dim V(C) = d} C$, we have $\dim(\Gamma_x) \le d - 1$. The property also obviously holds outside
that set, which means that item (1) is also verified. □


A more involved application of the decomposition is a complete solution of Conjecture 1 in two
dimensions, namely the following, which is proved in Section 8.

Theorem 2.14. Assume $d = 2$, $c(x,y) = |x - y|$, $\mu$ is absolutely continuous with respect to the
Lebesgue measure, and $\nu$ has compact support. Let $\pi \in MT(\mu,\nu)$ be a solution of (1.1). Then for
$\mu$-a.e. $x$, $\mathrm{supp}\,\pi_x = \mathrm{Ext}\big(\mathrm{conv}(\mathrm{supp}\,\pi_x)\big)$.
The decomposition also allows us to give in Section 8 the following positive answer to Conjecture
2, whenever the target measure is discrete. Note that in this case, the result holds true in both the
maximization and minimization problems.

Theorem 2.15. Let $c(x,y) = |x - y|$ (or more generally $c(x,y) = |x - y|^p$ with $p \neq 2$), suppose
$\mu$ is absolutely continuous with respect to the Lebesgue measure, and that $\nu$ is discrete; i.e. $\nu$ is
supported on a countable set. If $\pi \in MT(\mu,\nu)$ is an optimizer for (1.1), then for $\mu$-a.e. $x$, $\mathrm{supp}\,\pi_x$
consists of exactly $d + 1$ points which are vertices of a $d$-dimensional polytope in $\mathbb{R}^d$, and therefore
the optimal solution is unique.

Now we give a couple of examples, which illustrate that the above stated conjectures could be 
the best structural results we can hope for. 

Example 2.16. The polytope-like structure of the support required in Conjecture 2 does not hold in
general for the corresponding maximization problem. Indeed, since $\frac12(|x-y|-1)^2 \ge 0$, we have

$\frac12|y|^2 - \frac12|x|^2 + \frac12 - x\cdot(y-x) \ge |x-y|$ on $\mathbb{R}^d\times\mathbb{R}^d$, (2.2)

with equality on the set $\{(x,y) : |x-y| = 1\}$. The functions $\alpha(x) = \frac12|x|^2 - \frac12$, $\beta(y) = \frac12|y|^2$ and
$\gamma(x) = x$ then form a dual triplet for the maximization problem with cost $|x-y|$. This means that
every martingale $(X,Y)$ with $|X-Y| = 1$ a.s. is optimal for the maximization problem corresponding
to its own marginals $X \sim \mu$ and $Y \sim \nu$. Hence, $\operatorname{supp}\pi_x$ is not in general a discrete set, and indeed,
$\operatorname{supp}\pi_x$ can attain the Hausdorff dimension $d-1$.
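
Inequality (2.2) and its equality case can be checked numerically. The following is only a sanity-check sketch: the helper names `gap`, `min_gap` and `unit_gap`, the sampling box and the dimension are illustrative choices that do not come from the paper, and the inequality is read as $\frac12|y|^2 - \frac12|x|^2 + \frac12 - x\cdot(y-x) \ge |x-y|$.

```python
import math
import random


def gap(x, y):
    """Left side minus right side of inequality (2.2), as read above."""
    dot = lambda u, v: sum(a * b for a, b in zip(u, v))
    diff = [b - a for a, b in zip(x, y)]  # the increment y - x
    lhs = 0.5 * dot(y, y) - 0.5 * dot(x, x) + 0.5 - dot(x, diff)
    return lhs - math.sqrt(dot(diff, diff))


def min_gap(dim=3, trials=2000, seed=0):
    """Smallest gap over random pairs (x, y); it should never be negative."""
    rng = random.Random(seed)
    return min(
        gap([rng.uniform(-5, 5) for _ in range(dim)],
            [rng.uniform(-5, 5) for _ in range(dim)])
        for _ in range(trials)
    )


def unit_gap(dim=3, seed=1):
    """Gap at a pair with |x - y| = 1, where equality is expected."""
    rng = random.Random(seed)
    x = [rng.uniform(-5, 5) for _ in range(dim)]
    d = [rng.gauss(0, 1) for _ in range(dim)]
    n = math.sqrt(sum(t * t for t in d))
    y = [a + t / n for a, t in zip(x, d)]  # y lies on the unit sphere around x
    return gap(x, y)
```

Algebraically the gap equals $\frac12(|x-y|-1)^2$, which is why it vanishes exactly on unit increments.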

We now consider the uniqueness question in Conjecture 2, and whether it could hold for the
maximization problem. In [5] it is shown that when $d = 1$ the solution of the martingale transport
problem is unique for both the max and min problems under the assumption that $\mu$ is absolutely
continuous. Also, it is reported in [22] that in the minimization problem with radially symmetric
marginals $(\mu,\nu)$, the minimizer is again unique in any dimension. We note however that, unlike the
minimization case, one cannot expect the uniqueness of a maximizing martingale measure in higher
dimensions, even in the radially symmetric case, as the following example indicates.

Example 2.17. Let $\mu$ be a radially symmetric probability measure on $\mathbb{R}^2 \cong \mathbb{C}$ such that $\mu(\{0\}) = 0$.
Let zi = cos | + i sin f,z.i= cos | — i sin |, Z3 = ~Z\ and za = —Z2, and define the probability 
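measures $\pi^1$ and $\pi^2$ on $\mathbb{C}\times\mathbb{C}$, whose disintegrations $\pi^1_x$ and $\pi^2_x$ for each $x \in \mathbb{C}$, $x \ne 0$, are given by

$\pi^1_x = \frac14\,\delta_{x+\frac{x}{|x|}z_1} + \frac14\,\delta_{x+\frac{x}{|x|}z_2} + \frac14\,\delta_{x+\frac{x}{|x|}z_3} + \frac14\,\delta_{x+\frac{x}{|x|}z_4}$,

$\pi^2_x = \frac18\,\delta_{x+\frac{x}{|x|}z_1} + \frac38\,\delta_{x+\frac{x}{|x|}z_2} + \frac18\,\delta_{x+\frac{x}{|x|}z_3} + \frac38\,\delta_{x+\frac{x}{|x|}z_4}$.

Then, by the discussion in Example 2.16, one can see that both $\pi^1$ and $\pi^2$ are optimal for the
maximization problem corresponding to $\mu$ and $\nu := \nu_1 = \nu_2$, where $d\nu_i(y) = \int \pi^i_x(y)\,d\mu(x)$, $i = 1,2$;
hence, the maximizer is not unique.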
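
The two properties used in Example 2.17 (each kernel has barycenter $x$ and moves mass by exactly distance one) can be checked with complex arithmetic. This is a hedged sketch under stated assumptions: the atoms are taken to be $x + \frac{x}{|x|}z_j$ with $z_2 = \bar z_1$, $z_3 = -z_1$, $z_4 = -z_2$, the weights are read as $(\frac14,\frac14,\frac14,\frac14)$ and $(\frac18,\frac38,\frac18,\frac38)$, and the angle `theta` of $z_1$ is a free illustrative parameter, not the value used in the paper. All function names are hypothetical.

```python
import cmath

W1 = [1 / 4, 1 / 4, 1 / 4, 1 / 4]  # assumed weights of the first kernel
W2 = [1 / 8, 3 / 8, 1 / 8, 3 / 8]  # assumed weights of the second kernel


def kernel_atoms(x, theta, weights):
    """Atoms (point, weight) of the kernel at x != 0: points x + (x/|x|) z_j."""
    z1 = cmath.exp(1j * theta)
    z2 = z1.conjugate()
    zs = [z1, z2, -z1, -z2]
    rot = x / abs(x)  # rotation aligning the real axis with the direction of x
    return [(x + rot * z, w) for z, w in zip(zs, weights)]


def barycenter(atoms):
    """Mean of the kernel; the martingale condition requires it to equal x."""
    return sum(w * y for y, w in atoms)


def increments(x, atoms):
    """Distances |y - x| travelled by each atom."""
    return [abs(y - x) for y, _ in atoms]


def radii(atoms):
    """Distances of the atoms from the origin."""
    return [abs(y) for y, _ in atoms]
```

Under these assumptions the atoms built from $z_1$ and $z_2$ sit at the same distance from the origin, as do those built from $z_3$ and $z_4$, and each pair carries total mass $\frac12$ in both kernels. For radially symmetric $\mu$ this is the mechanism that makes the two second marginals coincide.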


Finally, we consider in Section 9 whether one can perform a disintegration of $\pi$ with respect to
the decomposition $\{\Gamma_C\}_C$ into components $(\pi_C)_C$ in such a way that each $\pi_C$ is a probability
measure supported on $\Gamma_C := \Gamma \cap (C \times \mathbb{R}^d)$ and $\pi_C \in MT(\mu_C,\nu_C)$, where $\mu_C, \nu_C$ are suitable probability
measures in convex order, with $\mu_C$ supported on $X_C := X_\Gamma \cap C$ and $\nu_C$ on $Y_{\Gamma_C}$. The advantage
of this decomposition is that if $\pi$ is optimal for the problem in $MT(\mu,\nu)$, then $\pi_C$ is optimal for
the same problem on $MT(\mu_C,\nu_C)$, with the added property that $\Gamma_C$ is a c-contact layer, which means
that duality is attained for each $\pi_C$. The decomposition of $\Gamma$ given by Theorem 9.1 was motivated
by a similar one proposed by Beiglböck-Juillet [5] in the one-dimensional case ($d = 1$). Our
decomposition is however quite different since it depends on the concentration set $\Gamma$ for $\pi$, while in their
case the decomposition depends only on the marginals $\mu$ and $\nu$. Theirs is also a countable partition,
which makes the restricted problems much more amenable to analysis. Actually, the intervals in
their decomposition are simply the connected components of the set where the potentials of $\mu$ and
$\nu$ are different on the real line. However, in the higher dimensional cases our decomposition can be
uncountable, which is why we talk about a disintegration as opposed to a decomposition. Moreover,
the induced probability measures $\mu_C$ can be Dirac measures (see Example 9.3), which means
that Theorem 2.4 may not be applicable to each piece $\pi_C$ even if duality is attained for the restricted
problem. We refer to Section 9 for the challenges and the interesting questions arising from this
fundamental decomposition.


3. The martingale c-Legendre transform 

In this section, we investigate properties of the admissible triplet of functions that appear in 
the dual martingale problem and their associated contact layers. Note that in the case of standard 
mass transport problems, the contact layer is determined by a potential function and its c-Legendre 
transform, whose regularity properties are inherited from those of c, and which can be studied
independently of the primal transport problem. A similar methodology works in our setting, once we
introduce an appropriate Legendre duality.

Definition 3.1. Let $Y$ be a Borel set in $\mathbb{R}^d$ such that $\Omega := IC(Y)$ is open in $\mathbb{R}^d$, and let $\beta : Y \to \mathbb{R}$ be
a Borel function such that for some $s \in \mathbb{R}$, $t \in \mathbb{R}^d$, $x_0 \in \Omega$, we have

$\beta(y) \le c(x_0,y) + t\cdot(y - x_0) + s$ for all $y \in Y$. (3.1)

(1) The martingale c-Legendre dual of the function $\beta$ on $\Omega$ is the pair $\beta_c := (\alpha_c,\gamma_c)$, where
$\alpha_c : \Omega \to \mathbb{R}$ is given by

$\alpha_c(x) := \inf\{a \in \mathbb{R} : \exists b \in \mathbb{R}^d \text{ such that } \beta(y) - c(x,y) \le b\cdot(y-x) + a,\ \forall y \in Y\}$, (3.2)

and $\gamma_c : \Omega \to \mathbb{R}^d$ is the possibly set-valued function defined by

$\gamma_c(x) := \{b \in \mathbb{R}^d : \beta(y) - c(x,y) \le b\cdot(y-x) + \alpha_c(x),\ \forall y \in Y\}$. (3.3)

(2) The martingale c-Legendre dual of a pair of functions $(\alpha,\gamma) : \Omega \to \mathbb{R}\times\mathbb{R}^d$ is the function
$(\alpha,\gamma)_c : \mathbb{R}^d \to \mathbb{R}$ defined by

$(\alpha,\gamma)_c(y) := \inf_{x\in\Omega,\ b\in\gamma(x)} \{c(x,y) + b\cdot(y-x) + \alpha(x)\}$. (3.4)

(3) We shall denote by $\beta_{cc}$ the martingale c-Legendre dual of the pair $\beta_c = (\alpha_c,\gamma_c)$, and say that
$\beta$ is martingale c-convex on $Y$ if $\beta = \beta_{cc}$ on $Y$.

In order to emphasize the analogy with the standard Fenchel-Legendre duality, we shall write
$\beta_c(x,y) = (\alpha_c,\gamma_c)(x,y) := \alpha_c(x) + \gamma_c(x)\cdot(y-x)$.
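
To make (3.2) concrete, here is a minimal one-dimensional sketch for a finite set $Y$. It relies on a fact that is assumed here rather than taken from the paper: in dimension one, the smallest value at $x$ of an affine function dominating $y \mapsto \beta(y) - c(x,y)$ on a finite $Y$ equals the largest chord interpolation at $x$ between two points of $Y$ on either side of $x$ (or the value at $x$ itself if $x \in Y$). The names `alpha_c` and `abs_cost` are hypothetical.

```python
def abs_cost(x, y):
    """Distance cost c(x, y) = |x - y| on the real line."""
    return abs(x - y)


def alpha_c(x, ys, beta, cost):
    """Value of (3.2) at x for a finite set ys on the real line.

    Returns the least a such that some slope b satisfies
    beta(y) - cost(x, y) <= b * (y - x) + a for every y in ys.
    The result is -inf when x lies outside the convex hull of ys.
    """
    f = {y: beta(y) - cost(x, y) for y in ys}
    best = f[x] if x in f else float("-inf")
    for yl in ys:
        for yr in ys:
            if yl < x < yr:
                t = (x - yl) / (yr - yl)  # weight placed on the right endpoint
                best = max(best, (1 - t) * f[yl] + t * f[yr])
    return best
```

For instance, with $Y = \{-1, 1\}$, $\beta \equiv 0$ and $c(x,y) = |x-y|$, the chord formula gives $\alpha_c(x) = x^2 - 1$ on $(-1,1)$.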

Theorem 3.2. Assume that $(x,y) \mapsto c(x,y)$ is continuous and $x \mapsto c(x,y)$ (respectively $y \mapsto c(x,y)$)
is locally Lipschitz with local Lipschitz constants uniformly bounded in $y$ (respectively in $x$). Let $Y$
be a Borel set in $\mathbb{R}^d$ such that $\Omega := IC(Y)$ is open in $\mathbb{R}^d$, and let $\beta : Y \to \mathbb{R}$ be a Borel function
satisfying (3.1), and $\beta_c = (\alpha_c,\gamma_c)$ its martingale c-Legendre dual. Then

(1) $\alpha_c$ is locally Lipschitz in $\Omega$, while $\gamma_c$ and $\beta_{cc}$ are locally bounded in $\Omega$.

(2) $\beta \le \beta_{cc}$ on $Y$, and

$\beta_{cc}(y) - \beta_c(x,y) \le c(x,y)$ for all $(x,y) \in \Omega\times\mathbb{R}^d$. (3.5)

In other words, the triple $(\beta_c,\beta_{cc}) = (\alpha_c,\gamma_c,\beta_{cc}) \in E_m(c,\Omega,\mathbb{R}^d)$.

(3) $\beta_{cc}(x) - \delta_1 \le \alpha_c(x) \le \beta_{cc}(x) + \delta_2$ for all $x \in \Omega$, where

$\delta_1 = \sup_{x\in\Omega} c(x,x)$ and $\delta_2 = \sup_{x,x'\in\Omega,\ y\in Y} [c(x,y) - c(x,x') - c(x',y)]$.

(4) Let $X \subset \Omega$ and let $(\alpha,\gamma)$ be defined on $X$ such that $(\alpha,\gamma,\beta) \in E_m(c,X,Y)$, then $\alpha(x) \ge \alpha_c(x)$
on $X$. Moreover, if a c-contact layer $\Gamma \subset \Gamma_{(\alpha,\gamma,\beta)}$ belongs to $S_{MT}$, then

$\alpha(x) = \alpha_c(x)$ and $\gamma(x) \subset \gamma_c(x)$ on $X_\Gamma$, $\beta_{cc} = \beta$ on $Y_\Gamma$, and $\Gamma \subset \Gamma_{(\alpha_c,\gamma_c,\beta_{cc})}$.

(5) The function $\beta_{cc}$ is martingale c-convex on $\mathbb{R}^d$, that is $\beta_{cc} = \beta_{cccc}$ on $\mathbb{R}^d$.
Proof. (1) We first show that $\alpha_c$ is locally bounded in $\Omega$. For $x \in \Omega = IC(Y)$, we may choose
$\{y_1,\dots,y_s\} \subset Y$ such that

$x \in U := IC(\{y_1,\dots,y_s\})$ and $U$ is open in $\mathbb{R}^d$. (3.6)

Since every $z \in U$ can be written as $z = \sum_i t_i y_i$ with $\sum_i t_i = 1$, $t_i > 0$, it is clear that
$\alpha_c(z) \ge M(z) := \min_i [\beta(y_i) - c(z,y_i)]$ for all $z \in U$.
In view of the continuity of $c$, this yields that $\alpha_c$ is locally lower bounded.

We now prove that $\alpha_c$ is locally upper bounded. Indeed, fix $R > 0$ and let $x \in \Omega$, $y \in Y$ be such that
$|x_0|, |x| < R$. By the local Lipschitz property of $c$ in $x$, i.e.

$|c(x,y) - c(x_0,y)| \le C|x - x_0|$

for some $C = C(R) > 0$ and for all $|x| < R$, we have that

$s + t\cdot(y - x_0) \ge \beta(y) - c(x_0,y) \ge \beta(y) - c(x,y) - C|x - x_0|$. (3.7)

Thus,

$s + C|x - x_0| + t\cdot(x - x_0) + t\cdot(y - x) \ge \beta(y) - c(x,y)$.

The definition of $\alpha_c$ gives

$s + C|x - x_0| + t\cdot(x - x_0) \ge \alpha_c(x)$. (3.8)

In particular, $\alpha_c$ is locally upper bounded, hence locally bounded.

Note now that $\gamma_c(x)$ is a set-valued function, and it is clearly closed and convex for each $x \in \Omega$. To
see the local boundedness of $\gamma_c$, use (3.6) and let $V$ be a small neighbourhood of $x$ whose closure is
in $U$. Since $\alpha_c$ is bounded on $V$, there exists a constant $C$ such that

$b\cdot(y_i - z) \ge C$, $\forall z \in V$, $i = 1,2,\dots,s$, $\forall b \in \gamma_c(z)$, (3.9)

which says that $\gamma_c$ is bounded on $V$, thus locally bounded on $\Omega$. To show that $\gamma_c(x)$ is nonempty for
any $x \in \Omega$, choose an approximating sequence $\{a_n\} \subset \mathbb{R}$ for $\alpha_c(x)$ and corresponding $\{b_n\} \subset \mathbb{R}^d$, in
such a way that $\beta(y) - c(x,y) \le b_n\cdot(y-x) + a_n$ and $a_n \searrow \alpha_c(x)$. Now the above argument shows
that $\{b_n\}$ must be bounded, hence its accumulation points must be in $\gamma_c(x)$.

We now show that $\alpha_c$ is locally Lipschitz. Since $\alpha_c$ is finite in $\Omega$, the above argument showing the
local boundedness for $\alpha_c$ can be repeated, giving (3.8) for any $x, x_0 \in \Omega$, $s = \alpha_c(x_0)$ and $t \in \gamma_c(x_0)$:

$\alpha_c(x_0) + C|x - x_0| + \gamma_c(x_0)\cdot(x - x_0) \ge \alpha_c(x)$.

By interchanging $x$ and $x_0$, we get

$|\alpha_c(x) - \alpha_c(x_0)| \le \big((|\gamma_c(x)| \vee |\gamma_c(x_0)|) + C\big)\,|x - x_0|$.

Therefore, the local boundedness of $\gamma_c$ implies that $\alpha_c$ is locally Lipschitz in $\Omega$. If furthermore
$x \mapsto c(x,y)$ is Lipschitz (with Lipschitz constant uniform in $y$) and $\gamma_c$ is bounded, then the above
estimate shows that $\alpha_c$ is Lipschitz in $\Omega$.

As for $\beta_{cc}$, it is clear that it is measurable and locally upper bounded. It is also clear that

$(\alpha_c,\gamma_c,\beta_{cc}) \in E_m(c,\Omega,\mathbb{R}^d)$ and $\beta \le \beta_{cc}$ on $Y$. (3.10)

We now show that $\beta_{cc}$ is locally bounded in $\Omega$, by following a similar argument as for $\alpha_c$. First, let
$x \in \Omega$, $y \in Y$, $y' \in \Omega$. By the local Lipschitz property of $c$ in $y$, i.e.

$|c(x,y) - c(x,y')| \le C|y - y'|$,

for some $C = C(R) > 0$, and for all $|y|, |y'| < R$, we see

$\beta(y) \le c(x,y) + \gamma_c(x)\cdot(y-x) + \alpha_c(x)$
$\quad\le c(x,y') + \gamma_c(x)\cdot(y'-x) + \alpha_c(x) + \gamma_c(x)\cdot(y-y') + C|y-y'|$. (3.11)

Now, since $y' \in \Omega = IC(Y)$, one can choose $\{y_1,\dots,y_s\} \subset Y$ such that

$y' \in W := IC(\{y_1,\dots,y_s\})$ and $W$ is open in $\mathbb{R}^d$.

Thus (3.11) implies, after putting the $y_i$'s in place of $y$ and summing up with appropriate weights, that

$\min_i \beta(y_i) \le \beta_{cc}(y') + C\max_i |y_i - y'|$,

hence yielding the local lower boundedness, thus the local boundedness of $\beta_{cc}$ in $\Omega$. This completes
the proof of the items (1) and (2).

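In order to establish (3), we first note that the inequality $\beta_{cc} - \delta_1 \le \alpha_c$ on $\Omega$ follows from the fact
that $(\alpha_c,\gamma_c,\beta_{cc}) \in E_m(c,\Omega,\mathbb{R}^d)$. For the other inequality, notice that for each $x \in \Omega$ and an arbitrary
$\varepsilon > 0$, there is $x_\varepsilon \in \Omega$ and $b \in \gamma_c(x_\varepsilon)$ (which we will simply write as $\gamma_c(x_\varepsilon)$ in the sequel), such that

$\alpha_c(x_\varepsilon) + \gamma_c(x_\varepsilon)\cdot(x - x_\varepsilon) + c(x_\varepsilon,x) - \varepsilon \le \beta_{cc}(x)$. (3.12)

Let $\alpha_\varepsilon(x) := \alpha_c(x_\varepsilon) + \gamma_c(x_\varepsilon)\cdot(x - x_\varepsilon) + c(x_\varepsilon,x)$, and consider the function

$L(z) = \alpha_\varepsilon(x) + \gamma_c(x_\varepsilon)\cdot(z - x) + c(x,z)$.

Then,

$\beta_{cc}(z) - L(z) \le \alpha_c(x_\varepsilon) + \gamma_c(x_\varepsilon)\cdot(z - x_\varepsilon) + c(x_\varepsilon,z)$
$\quad - \big(\alpha_c(x_\varepsilon) + \gamma_c(x_\varepsilon)\cdot(x - x_\varepsilon) + c(x_\varepsilon,x)\big) - \gamma_c(x_\varepsilon)\cdot(z - x) - c(x,z)$
$\quad = c(x_\varepsilon,z) - c(x_\varepsilon,x) - c(x,z) \le \delta_2$.

Hence $\beta_{cc}(z) \le L(z) + \delta_2$, and therefore $\beta(z) \le L(z) + \delta_2$ for $z \in Y$ by item (2). From the definition
of $\alpha_c$, this implies $\alpha_c(x) \le \alpha_\varepsilon(x) + \delta_2$, and from (3.12), we have $\alpha_c(x) \le \beta_{cc}(x) + \varepsilon + \delta_2$. Since $\varepsilon$ is
arbitrary, the proof of (3) is complete.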

To prove (4), first note that if $X \subset \Omega$ and $(\alpha,\gamma,\beta) \in E_m(c,X,Y)$, then the definition of $\alpha_c$ obviously
implies that $\alpha \ge \alpha_c$ on $X$. Now assume that $\Gamma \subset \Gamma_{(\alpha,\gamma,\beta)}$, $\Gamma \in S_{MT}$ and in particular, for each $x \in X_\Gamma$,
$x \in IC(\Gamma_x)$. Let $x = \sum_i t_i y_i$, $\sum_i t_i = 1$, $t_i > 0$, $y_i \in \Gamma_x$, and observe that

$\beta(y_i) - c(x,y_i) = \gamma(x)\cdot(y_i - x) + \alpha(x)$, (3.13)

$\beta(y_i) - c(x,y_i) \le \gamma_c(x)\cdot(y_i - x) + \alpha_c(x)$, (3.14)

where the first identity is due to the definition of $\Gamma_{(\alpha,\gamma,\beta)}$ and the second inequality is due to the
definition of $\beta_c = (\alpha_c,\gamma_c)$. Summing up the above relations with the weights $t_i$, we get

$\alpha(x) = \sum_i t_i\big(\beta(y_i) - c(x,y_i)\big) \le \alpha_c(x)$.

As $\alpha_c \le \alpha$ on $X$, this shows $\alpha(x) = \alpha_c(x)$ on $X_\Gamma$ and hence $\gamma(x) \subset \gamma_c(x)$ on $X_\Gamma$. Then for $x \in X_\Gamma$, by
subtracting (3.13) from (3.14), we get $(\gamma_c(x) - \gamma(x))\cdot(y - x) \ge 0$ for all $y \in \Gamma_x$. But since $x \in IC(\Gamma_x)$,
this implies

$(\gamma_c(x) - \gamma(x))\cdot(y - x) = 0$ for all $y \in \Gamma_x$.

In other words, the projections of $\gamma(x)$ and $\gamma_c(x)$ onto the affine subspace generated by $\Gamma_x$ are equal.
Now note that (3.14) obviously holds for $\beta_{cc}$ in place of $\beta$. Again by subtraction, we get $\beta_{cc}(y) \le \beta(y)$
for all $y \in \Gamma_x$. As the reverse inequality is already shown, we see that $\beta = \beta_{cc}$ on $Y_\Gamma$. Moreover, if
$(x,y) \in \Gamma$, in other words if $(x,y)$ satisfies (3.13), then the above discussion implies that (3.13) holds
with $(\alpha_c,\gamma_c,\beta_{cc})$. In other words, $(x,y) \in \Gamma_{(\alpha_c,\gamma_c,\beta_{cc})}$.

For item (5), we first note that $\beta_{cc}$ is defined on $\mathbb{R}^d$ and by item (2), we have $\beta_{cc} \le \beta_{cccc}$. For
the reverse inequality, fix $z \in \mathbb{R}^d$. Then by definition of $\beta_{cc}$, there exist a sequence $\{x_n\}$ in $\Omega$ and
$b_n \in \gamma_c(x_n)$, $n \ge 1$, such that

$\beta_{cc}(y) \le c(x_n,y) + b_n\cdot(y - x_n) + \alpha_c(x_n)$ for every $y \in \mathbb{R}^d$, and
$\beta_{cc}(z) = \lim_{n\to\infty}\, c(x_n,z) + b_n\cdot(z - x_n) + \alpha_c(x_n)$.

This readily implies that $\beta_{cc}(z) = \beta_{cccc}(z)$, completing the proof of the theorem.


□ 


















Remark 3.3. Note that both costs $c(x,y) = |x-y|$ and $c(x,y) = -|x-y|$ satisfy the above hypothesis,
and in both cases, i.e., $c(x,y) = \pm|x-y|$, we have that $\delta_1 = 0$. Moreover, $\delta_2 \le 2\operatorname{diam}(\Omega)$ if
$c(x,y) = -|x-y|$.

On the other hand, if $c(x,y) = |x-y|$, then $\delta_2 = 0$, which means that $\alpha_c = \beta_{cc}$ on $\Omega$. In particular, by
Theorem 3.2(5), the duality theorem becomes

$\inf\Big\{\int_{\mathbb{R}^d\times\mathbb{R}^d} |x-y|\,d\pi \;;\; \pi \in MT(\mu,\nu)\Big\} = \sup\Big\{\int_{\mathbb{R}^d} \beta\,d(\nu-\mu) \;;\; \beta \text{ is martingale c-convex}\Big\}$,

which can be seen as the counterpart of the Kantorovich-Rubinstein duality formulation in standard
transport theory, whenever the cost is given by a distance function.
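
The two bounds on $\delta_2$ quoted in Remark 3.3 follow from the triangle inequality, and a random search illustrates them. This is only an illustrative sketch: it samples $x, x', y$ from a box standing in for $\Omega$ and $Y$ (an assumption made purely for the experiment), so it returns a lower estimate of the supremum. The names `delta2_sample`, `dist` and `neg_dist` are hypothetical.

```python
import math
import random


def dist(u, v):
    """Cost c(x, y) = |x - y|."""
    return math.dist(u, v)


def neg_dist(u, v):
    """Cost c(x, y) = -|x - y|."""
    return -math.dist(u, v)


def delta2_sample(cost, dim=2, radius=1.0, trials=5000, seed=0):
    """Largest sampled value of c(x,y) - c(x,x') - c(x',y) over a box."""
    rng = random.Random(seed)

    def point():
        return [rng.uniform(-radius, radius) for _ in range(dim)]

    best = float("-inf")
    for _ in range(trials):
        x, xp, y = point(), point(), point()
        best = max(best, cost(x, y) - cost(x, xp) - cost(xp, y))
    return best
```

For $c = |x-y|$ every sampled value is nonpositive (the supremum $0$ is approached as $x' \to x$), while for $c = -|x-y|$ the sampled values stay below twice the diameter of the box.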


Remark 3.4. (Localization) Let $K$ be a compact set in $\Omega$ and let $\alpha_c^K$ and $\gamma_c^K$ be the restrictions of $\alpha_c$
and $\gamma_c$ on $K$, then $(\alpha_c^K,\gamma_c^K,\beta_{cc}^K) \in E_m(c,K,\mathbb{R}^d)$, where

$\beta_{cc}^K(y) := \inf_{x\in K}\{c(x,y) + \gamma_c^K(x)\cdot(y-x) + \alpha_c^K(x)\}$.

Consequently, $\alpha_c^K$ is Lipschitz in $K$, and $\gamma_c^K$ is bounded in $K$. Moreover, $\beta_{cc}^K$ is Lipschitz (resp.,
locally Lipschitz) in $\mathbb{R}^d$ provided $y \mapsto c(x,y)$ is Lipschitz (resp., locally Lipschitz) in $\mathbb{R}^d$.

Indeed, from the definition of $\beta_{cc}^K$, the boundedness of $\gamma_c$ on $K$ and the local Lipschitz assumption
on $y \mapsto c(x,y)$ (uniformly in $x$), we see that $\beta_{cc}^K$ is the infimum of locally Lipschitz functions
parametrized by $x \in K$ with the local Lipschitz constant uniform in $x$. This shows that $\beta_{cc}^K$ is locally
Lipschitz in $\mathbb{R}^d$. If in addition, $c$ is Lipschitz, then by the same reasoning $\beta_{cc}^K$ is Lipschitz in $\mathbb{R}^d$.


4. Extremal structure of a c-contact layer


We first deal with the differentiability properties of an admissible triple $(\alpha,\gamma,\beta)$. The next lemma
shows that essentially $\gamma$ is differentiable in an appropriate sense, wherever $\alpha$ is. This property will
be crucial in the proof of Theorem 2.4.

Lemma 4.1. Suppose $x \mapsto c(x,y)$ is differentiable at $x$ whenever $x \ne y$, and assume that $\Gamma$ is a set
in $S_{MT}$ that is a c-contact layer for a triple $(\alpha,\gamma,\beta) \in E_m(c,\Omega,\mathbb{R}^d)$, where $\alpha : \Omega \to \mathbb{R}$, $\beta : \mathbb{R}^d \to \mathbb{R}$,
and $\gamma : \Omega \to \mathbb{R}^d$. Fix $x \in X_\Gamma$, and let $V$ be the vector subspace of $\mathbb{R}^d$ corresponding to the affine
space $V(\Gamma_x)$, and assume $\dim(V) \ge 1$. Assume there is $s \in V$ such that

$\alpha(x') \le s\cdot(x' - x) + \alpha(x) + o(|x' - x|)$ as $x' \to x$ in $V(\Gamma_x)$. (4.1)

Let $\operatorname{proj}_V\gamma$ be the orthogonal projection of the value of $\gamma$ on $V$. Then $\alpha$ and $\operatorname{proj}_V\gamma$ have a directional
derivative at $x$ in every direction $u \in V$.


Proof. By the duality assumption for the minimization problem, for all $x' \in \Omega$ and all $(x,y) \in \Gamma$,

$c(x',y) + \gamma(x')\cdot(y - x') + \alpha(x') \ge c(x,y) + \gamma(x)\cdot(y - x) + \alpha(x)$. (4.2)

Choose a unit vector $u \in V$ and let $x' = x + tu$. Then (4.2) is rewritten as

$\frac{\alpha(x+tu) - \alpha(x)}{t} \ge \frac{\gamma(x+tu) - \gamma(x)}{t}\cdot(x + tu - y) + \gamma(x)\cdot u - \frac{c(x+tu,y) - c(x,y)}{t}$ if $t > 0$, (4.3)

$\frac{\alpha(x+tu) - \alpha(x)}{t} \le \frac{\gamma(x+tu) - \gamma(x)}{t}\cdot(x + tu - y) + \gamma(x)\cdot u - \frac{c(x+tu,y) - c(x,y)}{t}$ if $t < 0$. (4.4)

Let us use the notation $D_{t,u}f(x) = \frac{f(x+tu) - f(x)}{t}$. Now the assumption (4.1) says that

$\limsup_{t\downarrow0} D_{t,u}\alpha(x) \le s\cdot u \le \liminf_{t\uparrow0} D_{t,u}\alpha(x)$. (4.5)

Since $x \in \operatorname{int}(\operatorname{conv}(\Gamma_x))$, there exist $y_1,\dots,y_k \in \Gamma_x \setminus \{x\}$, $p_1,\dots,p_k > 0$, $q_1,\dots,q_k > 0$, $\sum p_i = 1$,
$\sum q_i = 1$, $t_+ > 0$, $t_- < 0$, such that

$x + t_+u = \sum p_i y_i$,
$x + t_-u = \sum q_i y_i$.
x + t-U = T.qiyi. 
Note that the first term on the right side of (4.3) and (4.4) is linear in $y$, so by summing up over the $y_i$'s
with the weights $p_i$'s or $q_i$'s, we get (and we write $\gamma_1(x) := \gamma(x)\cdot u$)

$D_{t,u}\alpha(x) \ge D_{t,u}\gamma_1(x)\,(t - t_\pm) + C_\pm(t)$ if $t > 0$, (4.6)

$D_{t,u}\alpha(x) \le D_{t,u}\gamma_1(x)\,(t - t_\pm) + C_\pm(t)$ if $t < 0$. (4.7)

Here $C_+(t)$, $C_-(t)$ are functions of $t \ne 0$, but have limits as $t \to 0$ by the differentiability assumption
on the cost. Write $C_\pm = \lim_{t\to0} C_\pm(t)$, respectively.

By taking $\limsup_{t\downarrow0}$ in (4.6) and $\liminf_{t\uparrow0}$ in (4.7) and by recalling that $t_+ > 0$, $t_- < 0$, we have

$\limsup_{t\downarrow0} D_{t,u}\alpha(x) \ge (-t_+)\liminf_{t\downarrow0} D_{t,u}\gamma_1(x) + C_+$,

$\limsup_{t\downarrow0} D_{t,u}\alpha(x) \ge (-t_-)\limsup_{t\downarrow0} D_{t,u}\gamma_1(x) + C_-$,

$\liminf_{t\uparrow0} D_{t,u}\alpha(x) \le (-t_+)\limsup_{t\uparrow0} D_{t,u}\gamma_1(x) + C_+$,

$\liminf_{t\uparrow0} D_{t,u}\alpha(x) \le (-t_-)\liminf_{t\uparrow0} D_{t,u}\gamma_1(x) + C_-$.

This and (4.5) combine to give

$\liminf_{t\uparrow0} D_{t,u}\gamma_1(x) \ge \limsup_{t\downarrow0} D_{t,u}\gamma_1(x) \ge \liminf_{t\downarrow0} D_{t,u}\gamma_1(x)$
$\quad\ge \limsup_{t\uparrow0} D_{t,u}\gamma_1(x) \ge \liminf_{t\uparrow0} D_{t,u}\gamma_1(x)$,

that is $\gamma_1 = \gamma\cdot u$ is differentiable at $x$ in the direction $u$. Knowing this, we then take $\liminf_{t\downarrow0}$ in (4.6)
and $\limsup_{t\uparrow0}$ in (4.7) to get

$\liminf_{t\downarrow0} D_{t,u}\alpha(x) \ge (-t_+)\nabla_u\gamma_1(x) + C_+$,

$\limsup_{t\uparrow0} D_{t,u}\alpha(x) \le (-t_+)\nabla_u\gamma_1(x) + C_+$.

Combining this with (4.5), we get the differentiability of $\alpha$ at $x$ in the direction $u$.

Next, choose any unit vector $v \in V$ orthogonal to $u$ and let $\gamma_2(x) := \gamma(x)\cdot v$. We want to show
that $\nabla_u\gamma_2(x)$ exists. We proceed just as before; for some $k \in \mathbb{N}$, there exist $y_1,\dots,y_k \in \Gamma_x \setminus \{x\}$,
$p_1,\dots,p_k > 0$, $q_1,\dots,q_k > 0$, $\sum p_i = 1$, $\sum q_i = 1$, $t_+ > 0$, $t_- < 0$, such that

$x + t_+v = \sum p_i y_i$,
$x + t_-v = \sum q_i y_i$.

By summing up over the $y_i$'s with the weights $p_i$'s or $q_i$'s as before, we get this time

$D_{t,u}\alpha(x) \ge t\,D_{t,u}\gamma_1(x) - t_\pm D_{t,u}\gamma_2(x) + C_\pm(t)$ if $t > 0$, (4.8)

$D_{t,u}\alpha(x) \le t\,D_{t,u}\gamma_1(x) - t_\pm D_{t,u}\gamma_2(x) + C_\pm(t)$ if $t < 0$. (4.9)

Taking $\limsup_{t\downarrow0}$ in (4.8) and $\liminf_{t\uparrow0}$ in (4.9) and recalling $t_+ > 0$, $t_- < 0$, and the existence of
$\lim_{t\to0} D_{t,u}\gamma_1(x)$, we see that

$\nabla_u\alpha(x) \ge (-t_+)\liminf_{t\downarrow0} D_{t,u}\gamma_2(x) + C_+$,

$\nabla_u\alpha(x) \ge (-t_-)\limsup_{t\downarrow0} D_{t,u}\gamma_2(x) + C_-$,

$\nabla_u\alpha(x) \le (-t_+)\limsup_{t\uparrow0} D_{t,u}\gamma_2(x) + C_+$,

$\nabla_u\alpha(x) \le (-t_-)\liminf_{t\uparrow0} D_{t,u}\gamma_2(x) + C_-$,

which implies differentiability of $\gamma_2 = \gamma\cdot v$ at $x$ in the direction $u$. Now choose an orthonormal basis
$\{u, v_1,\dots,v_m\}$ of $V$ and write $\operatorname{proj}_V\gamma = (\gamma\cdot u)u + \sum_i(\gamma\cdot v_i)v_i$. We observed that each component of
$\operatorname{proj}_V\gamma$ is directionally differentiable. This completes the proof. □

Remark 4.2. For the maximization problem, we need to reverse the inequalities in (4.1) and (4.2),
and then proceed in the same way. Hence, the lemma is proved.

We now restrict our attention to the cases $c(x,y) = \pm|x-y|$ in trying to describe the profile of a
set $\Gamma$ that is a c-contact layer.

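Lemma 4.3. Let $\Gamma \in S_{MT}$, $\Omega$ an open set in $\mathbb{R}^d$ containing $X_\Gamma$, and let $\alpha : \Omega \to \mathbb{R}$ and $\gamma : \Omega \to \mathbb{R}^d$ be two
functions. Let $\beta : \mathbb{R}^d \to \mathbb{R}$ be either

$\beta(y) = \sup_{x\in\Omega}\{|x-y| + \gamma(x)\cdot(y-x) + \alpha(x)\}$; (4.10)

or

$\beta(y) = \inf_{x\in\Omega}\{|x-y| + \gamma(x)\cdot(y-x) + \alpha(x)\}$. (4.11)

Assume that $\Gamma$ satisfies

$\beta(y) = |x-y| + \gamma(x)\cdot(y-x) + \alpha(x)$ for all $(x,y) \in \Gamma$. (4.12)

If $\alpha$ and $\gamma$ are differentiable at $x \in X_\Gamma$, then the closure $\overline{\Gamma_x}$ coincides with the set of extreme points of
the convex hull of $\overline{\Gamma_x}$, i.e., $\overline{\Gamma_x} = \operatorname{Ext}(\operatorname{conv}(\overline{\Gamma_x}))$.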

Proof. First note that, for any closed set $A$ in $\mathbb{R}^d$, it is clear that $\operatorname{Ext}(\operatorname{conv}(A)) \subset A$. To show the
reverse inclusion in our setting, we define the “tilted cone”

$\xi(x,y) = \xi_x(y) = \xi_y(x) := |x-y| + \gamma(x)\cdot(y-x) + \alpha(x)$.

The duality condition (4.12) with (4.10) tells us the following: if $(x,y) \in \Gamma$, then for all $x' \in \Omega$,

$\xi_{x'}(y) \le \xi_x(y)$. (4.13)

With (4.12) and (4.11) we get the reverse inequality instead.

Note that since $\xi_x(y)$ is continuous in $y$, the same inequality holds for all $y \in \overline{\Gamma_x}$. This obviously implies
that, if $y \in \overline{\Gamma_x}$ and $x \ne y$, then the gradient with respect to $x$ vanishes:

$\nabla\xi_y(x) = 0$, (4.14)

and in fact (4.13) also implies that if $y \in \overline{\Gamma_x}$, then necessarily $x \ne y$. (If $x = y$, then the function
$\xi_y(x)$ strictly increases as $x$ moves along the direction $\nabla_x[\gamma(x)\cdot(y-x) + \alpha(x)]$.) We may call this the
non-staying property, or unstability, for the maximization problem. For the minimization problem,
without loss of generality we already assumed that $x \notin \Gamma_x$, but in fact $x \notin \overline{\Gamma_x}$ as well, by (4.15)
below.

Now suppose the lemma is false. Then we can find $\{y, y_0,\dots,y_s\} \subset \overline{\Gamma_x}$ for some $s \ge 1$ with
$y = \sum_{i=0}^s p_i y_i$, $\sum_{i=0}^s p_i = 1$. Choose a minimal $s$ such that all $p_i > 0$. Now taking the directional derivative in
the direction $u = \frac{x-y}{|x-y|}$ gives

$\nabla_u\xi_y(x) = \nabla_u\xi_{y_i}(x) = 0$ for all $i = 0,1,\dots,s$.

We compute

$\nabla_u\xi_{y_i}(x) = \frac{x - y_i}{|x - y_i|}\cdot u + \nabla_u\gamma(x)\cdot(y_i - x) - \gamma(x)\cdot u + \nabla_u\alpha(x)$.

Then, by the linearity of $y \mapsto \nabla_u\gamma(x)\cdot y$, the equation $\nabla\xi_y(x) = 0$ simply becomes

$1 = \sum_i p_i\,\frac{x - y_i}{|x - y_i|}\cdot\frac{x - y}{|x - y|}$.

As $\frac{x-y_i}{|x-y_i|}$ is a unit vector and all $p_i > 0$, this can hold only if all $y_i$ lie on the ray emanating from $x$ through $y$. The
minimality of $s$ then implies that $s = 1$, hence $\{y, y_0, y_1\} \subset \overline{\Gamma_x}$ would lie on a ray emanating from $x$,
which is a contradiction, once we prove the following claim:

$\overline{\Gamma_x}$ is contained in the topological boundary of the closed convex hull of $\Gamma_x$. (4.15)

Recall that here the topology is not the topology in $\mathbb{R}^d$ but the topology in $V := V(\Gamma_x)$. If our claim
is false and assuming first that $\dim(V) \ge 2$, we can find $y \in \overline{\Gamma_x} \cap IC(\Gamma_x)$ as a barycenter of a triangle
joining 3 points $y_0, y_1, y_2$ in $\overline{\Gamma_x}$. But the above argument implies that $y_0, y_1, y_2$ have to be aligned,
which is a contradiction. If $\dim(V) = 1$, then as $x \in IC(\Gamma_x)$, we can find $\{y, y_0, y_1\} \subset \overline{\Gamma_x}$ such that $x$
and $y$ are in the interior of the line segment $\overline{y_0y_1}$. But then again by the above, $\{y, y_0, y_1\}$ must lie on the
ray (i.e. half-line) emanating from $x$, a contradiction. Finally, we cannot have $\dim(V) = 0$ since this
simply means that $\Gamma_x = \{x\}$, whereas we already showed above that $x \notin \overline{\Gamma_x}$ in the case of maximization,
and we assumed without loss of generality that $x \notin \Gamma_x$ in the case of minimization. □
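
The step in the proof where a convex combination of unit vectors is forced to be degenerate can be illustrated numerically. The sketch below is only an illustration of that elementary fact (if $\sum_i p_i\, e_i\cdot e = 1$ with unit vectors $e_i$, $e$ and positive weights $p_i$ summing to one, then every $e_i = e$); the names `unit` and `alignment` are hypothetical and the planar setting is chosen for convenience.

```python
import math


def unit(angle):
    """Unit vector in the plane at the given angle."""
    return (math.cos(angle), math.sin(angle))


def alignment(units, weights, e):
    """Weighted sum of the inner products e_i . e."""
    return sum(
        w * sum(a * b for a, b in zip(u, e))
        for u, w in zip(units, weights)
    )
```

Each inner product is at most one, so the weighted sum reaches one only when every direction coincides with `e`; as soon as one direction differs, the value drops strictly below one.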

Finally, the following result follows immediately from Theorem 3.2 and Lemmata 4.1 and 4.3.

Corollary 4.4. Let $c(x,y) = \pm|x-y|$ and assume $\Gamma$ is a c-contact layer in $S_{MT}$. If $X_\Gamma \subset \Omega := IC(Y_\Gamma)$
with $\Omega$ being an open set in $\mathbb{R}^d$, then for $\mathcal{L}^d$-a.e. $x$ in $\Omega$, the closure $\overline{\Gamma_x}$ coincides with the set of
extreme points of the convex hull of $\overline{\Gamma_x}$, i.e., $\overline{\Gamma_x} = \operatorname{Ext}(\operatorname{conv}(\overline{\Gamma_x}))$.


5. Structure of optimal martingale supporting sets when the dual is attained 


The goal of this section is to prove Theorem 2.4, which shows that dual attainment in the
optimization problem implies that any optimal martingale transport is concentrated on a c-contact
layer, and therefore has a specific extremal structure. We start by collecting the properties verified
by a well-chosen concentration set of a martingale measure. The proof is given in Appendix A.


Lemma 5.1. Let $\pi \in MT(\mu,\nu)$ and let $A \subset \mathbb{R}^d\times\mathbb{R}^d$ be a Borel set with $\pi(A) = 1$. Then there exists
a Borel set $\Gamma \subset A$ with $\pi(\Gamma) = 1$ such that the map $x \mapsto \pi_x$ is measurable and defined everywhere on
$X_\Gamma$ in such a way that:

(1) $\overline{\Gamma_x} = \operatorname{supp}\pi_x$ for all $x \in X_\Gamma$,

(2) $\Gamma \in S_{MT}$, that is $x \in IC(\Gamma_x)$ for all $x \in X_\Gamma$,

(3) If we assume that $\mu \ll \mathcal{L}^d$, then $\Gamma$ can be chosen in such a way that $X_\Gamma \subset IC(Y_\Gamma)$.

(4) If in addition $\pi$ is a solution of the optimization problem, then $\Gamma$ can be chosen to be
finitely c-exposable.


This leads us to use the following terminology. 


Definition 5.2. Let $\pi$ be a martingale transport plan in $MT(\mu,\nu)$. We shall say that

(1) $\Gamma$ is a regular concentration set for $\pi$ if $\Gamma$ satisfies (1), (2), (3) in Lemma 5.1.

(2) $\Gamma$ is a martingale-monotone regular concentration set for $\pi$ (or simply $\Gamma$ is
martingale-monotone regular for $\pi$) if $\Gamma$ also satisfies (4).


As mentioned in the introduction, there is a dual formulation for the optimization problem, just like in the
Monge-Kantorovich theory for (non-martingale) mass transport.


Lemma 5.3. (see e.g. Q) Let $\mu$ and $\nu$ be two probability measures on $\mathbb{R}^d$ in convex order, and let
$c : \mathbb{R}^d\times\mathbb{R}^d \to \mathbb{R}$ be a cost function that is lower semi-continuous, then

$\inf\Big\{\int_{\mathbb{R}^d\times\mathbb{R}^d} c(x,y)\,d\pi \;;\; \pi \in MT(\mu,\nu)\Big\} = \sup\Big\{\int_{\mathbb{R}^d}\beta\,d\nu - \int_{\mathbb{R}^d}\alpha\,d\mu \;;\; (\alpha,\gamma,\beta) \in E_m \text{ for some } \gamma \in C_b(\mathbb{R}^d,\mathbb{R}^d)\Big\}$, (5.1)

and the minimization problem is attained at some martingale transport $\pi$. A similar result holds
for the cost maximization problem, provided $c$ is upper semi-continuous, and $E_m$ is replaced by the
corresponding class of admissible triples for the maximization problem.
Furthermore,

(1) If the dual problem is attained, then there is a concentration set $\Gamma$ for $\pi$ that is a c-contact
layer.

(2) Conversely, if $G \subset \mathbb{R}^d\times\mathbb{R}^d$ is a c-contact layer and $\pi^*(G) = 1$ for some $\pi^* \in MT(\mu,\nu)$, then
$\pi^*$ is an optimal martingale transport.


Proof. For (5.1), see the reference cited in the statement. Let us show the items (1) and (2). Note that if the dual problem is attained
at functions $\alpha$, $\beta$ such that the triplet $(\alpha,\gamma,\beta)$ is in $E_m(c,\mathbb{R}^d,\mathbb{R}^d)$, then since

$\beta(y) - \alpha(x) - \gamma(x)\cdot(y-x) \le c(x,y)$ for all $(x,y) \in \mathbb{R}^d\times\mathbb{R}^d$, (5.2)

$\mu$ and $\nu$ are the marginals of some optimal $\pi$ in $MT(\mu,\nu)$, and $\int\gamma(x)\cdot(y-x)\,d\pi(x,y) = 0$ (due to the
martingale condition), we then have

$\int_{\mathbb{R}^d\times\mathbb{R}^d}\big[\beta(y) - \alpha(x) - \gamma(x)\cdot(y-x)\big]\,d\pi(x,y) = \int_{\mathbb{R}^d\times\mathbb{R}^d} c(x,y)\,d\pi(x,y)$.

It follows that

$\beta(y) - \alpha(x) - \gamma(x)\cdot(y-x) = c(x,y)$ for $\pi$-a.e. $(x,y) \in \mathbb{R}^d\times\mathbb{R}^d$, (5.3)

hence the equality holds on a concentration set $\Gamma$ of $\pi$.

Conversely, if $G \subset \mathbb{R}^d\times\mathbb{R}^d$ and $\pi^*(G) = 1$ for some $\pi^* \in MT(\mu,\nu)$ and if there exists a triplet
$(\alpha,\gamma,\beta)$ in $E_m(c,X_G,Y_G)$ with equality (5.3) holding on $G$, then $\pi^*$ is an optimal solution of the
primal problem in (5.1). Indeed, let $\pi \in MT(\mu,\nu)$ and let $H$ be such that $\pi(H) = 1$. As $\mu(X_G) = 1$
and $\nu(Y_G) = 1$, by restriction we can then assume that $X_H \subset X_G$ and $Y_H \subset Y_G$, hence by integrating (5.2)
with $\pi$, we get

$\int_{\mathbb{R}^d\times\mathbb{R}^d} c(x,y)\,d\pi(x,y) \ge \int_{\mathbb{R}^d}\beta(y)\,d\nu(y) - \int_{\mathbb{R}^d}\alpha(x)\,d\mu(x)$.

(Again $\int\gamma(x)\cdot(y-x)\,d\pi(x,y) = 0$ since $\pi \in MT(\mu,\nu)$.) However, by integrating (5.2) with $\pi^*$ and
since we have equality on $G$, we get

$\int_{\mathbb{R}^d\times\mathbb{R}^d} c(x,y)\,d\pi^* = \int_{\mathbb{R}^d\times\mathbb{R}^d}\{\beta(y) - \alpha(x) - \gamma(x)\cdot(y-x)\}\,d\pi^* = \int_{\mathbb{R}^d}\beta(y)\,d\nu - \int_{\mathbb{R}^d}\alpha(x)\,d\mu$.

This shows that n* is optimal. Hence, every martingale measure that is concentrated on a c-contact 
layer is optimal. On the other hand, there exist optimal martingale measures that do not concentrate 
on c-contact layers (01). □ 
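
As a concrete instance of a plan concentrated on a contact layer, one can take the triple from Example 2.16 (read here as $\alpha(x) = \frac12|x|^2 - \frac12$, $\beta(y) = \frac12|y|^2$, $\gamma(x) = x$, which is a dual triple for the maximization problem) and a small discrete martingale with unit increments. The sketch below is illustrative only: the two source points, the three equally weighted directions and all function names are hypothetical choices, not data from the paper.

```python
import math


def example_plan():
    """Hypothetical discrete martingale plan in the plane.

    X is uniform on two points; given X = x, Y is x plus a unit vector at
    angle 0, 2*pi/3 or 4*pi/3, each with probability 1/3 (barycenter x).
    Returns a list of (x, y, weight).
    """
    xs = [(0.3, -1.2), (2.0, 0.5)]
    dirs = [
        (math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
        for k in range(3)
    ]
    plan = []
    for x in xs:
        for d in dirs:
            y = (x[0] + d[0], x[1] + d[1])
            plan.append((x, y, (1 / len(xs)) * (1 / 3)))
    return plan


def primal_value(plan):
    """Expected cost: integral of |x - y| against the plan."""
    return sum(w * math.dist(x, y) for x, y, w in plan)


def dual_value(plan):
    """Integral of beta against the second marginal minus alpha against the first."""
    alpha = lambda x: 0.5 * (x[0] ** 2 + x[1] ** 2) - 0.5
    beta = lambda y: 0.5 * (y[0] ** 2 + y[1] ** 2)
    return sum(w * (beta(y) - alpha(x)) for x, y, w in plan)
```

Both values equal one for this plan: the $\gamma$ term integrates to zero by the martingale property, and equality in (2.2) holds on every pair in the support, which is the mechanism behind item (2) of the lemma.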


This suggests that dual attainability is actually a property of the support of the optimal martingale
transport and not of the measure itself. Now an obvious but important remark is that any subset of
a c-contact layer is also a c-contact layer. The same holds for dual attainment in the martingale
transport problem. Indeed, if $\pi \in MT(\mu,\nu)$ and $B$ is a Borel set, we denote by $\pi_B$ its restriction on
$B\times\mathbb{R}^d$, and we let $\mu_B$, $\nu_B$ be the first and second marginals of $\pi_B$. Then we introduce the following:

Definition 5.4. Let $\pi \in MT(\mu,\nu)$ be given, and let $B$ be a Borel set. We say that an admissible triple
$(\alpha,\gamma,\beta) \in E_m(c,B,\mathbb{R}^d)$ is c-dual to $\pi$ on $B$, if the following holds:

$\int_{\mathbb{R}^d}\beta(y)\,d\nu_B(y) - \int_{\mathbb{R}^d}\alpha(x)\,d\mu_B(x) = \int_{\mathbb{R}^d\times\mathbb{R}^d} c(x,y)\,d\pi_B(x,y)$. (5.4)

If such a triple exists, then we say that $\pi$ admits a c-dual on $B$. Note that in this case,
$\pi_B(\Gamma_{(\alpha,\gamma,\beta)}) = \mu(B)$, that is, $\pi_B$ is concentrated on a c-contact layer.

Now, we can deduce the following. 


Theorem 5.5. Let $c(x,y) = \pm|x-y|$ and $\mu$ be a probability measure that is absolutely continuous with
respect to the Lebesgue measure. If $\pi \in MT(\mu,\nu)$ is a solution of the optimization problem, for either the
minimization or maximization problem, that admits a c-dual on a Borel subset $B$, then for $\mu$-almost all $x \in B$,
$\operatorname{supp}\pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))$.


Proof. Let $(\alpha,\gamma,\beta)$ be a c-dual to $\pi$ on $B$ and let $A$ be its contact layer. Then $A$ contains the full
measure (that is, $\mu(B)$) of $\pi_B$. Apply Lemma 5.1 to get $\Gamma \subset A$ in such a way that $\pi_B(\Gamma) = \mu(B)$,
$\Gamma \in S_{MT}$, $X_\Gamma \subset \Omega := IC(Y_\Gamma)$ and $\operatorname{supp}\pi_x = \overline{\Gamma_x}$ for $\mu$-a.e. $x \in B$. Now since $\Gamma$ is also a c-contact layer,
Corollary 4.4 applies to get the claimed result. □


Remark 5.6. Note that the above theorem shows that Conjecture 1 is valid provided duality is
attained locally, that is, if for any $x$ in the support of $\mu$, there exists a ball $B$ centered at $x$
such that the optimal martingale measure $\pi$ admits a c-dual on $B$. This refinement will be used in
the next section. On the other hand, there exists an optimal martingale measure where “local dual 
attainment” does not hold on any neighborhood. This can be seen with the following example given 
Example 5.7. Let $\mu = \nu$ be two identical probability measures on the interval $[0,1]$, then the only
martingale (say $\pi$) from $\mu$ to itself is the identity transport, hence it is obviously the solution of the
maximization problem with respect to the distance cost, and its support is $\Gamma = \{(x,x) : x \in [0,1]\}$. If
now $(\alpha,\gamma,\beta)$ is a solution to the dual problem, then

$\beta(y) \ge |x-y| + \gamma(x)\cdot(y-x) + \alpha(x)$ $\forall x \in [0,1]$, $\forall y \in [0,1]$;

$\beta(y) = |x-y| + \gamma(x)\cdot(y-x) + \alpha(x)$ $\forall (x,y) \in \Gamma$.

The above relations easily yield that for any $0 < a < b < 1$, we have $\gamma(a) + 2 \le \gamma(b)$, which means
that it is impossible to define a suitable real-valued function $\gamma$ for a.e. $x$ in $[0,1]$.
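
One way to recover the inequality $\gamma(a) + 2 \le \gamma(b)$ is sketched below; this is a possible derivation under the relations displayed in the example, not necessarily the argument the authors had in mind. Taking $y = x$ in the equality on $\Gamma$ gives $\beta = \alpha$ on $[0,1]$, and the inequality is then applied twice, with the roles of $a < b$ exchanged.

```latex
\begin{align*}
\alpha(b) &\ge (b-a) + \gamma(a)\,(b-a) + \alpha(a)
  && (x = a,\ y = b),\\
\alpha(a) &\ge (b-a) - \gamma(b)\,(b-a) + \alpha(b)
  && (x = b,\ y = a),\\
0 &\ge 2(b-a) + \bigl(\gamma(a) - \gamma(b)\bigr)(b-a)
  && \text{(sum of the two lines)},\\
\gamma(b) &\ge \gamma(a) + 2
  && \text{(divide by } b-a > 0\text{)}.
\end{align*}
```

Applying the last line along a chain $a_0 < a_1 < \dots < a_n$ of points where $\gamma$ is finite gives $\gamma(a_n) - \gamma(a_0) \ge 2n$ for every $n$, which no real-valued $\gamma$ can satisfy.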

6. When the marginals are in subharmonic order 

In this section, we consider a case where the dual martingale problem is attained, at least locally,
which will allow us to apply Theorem 5.5 and verify that Conjecture 1 holds in that particular case.
We consider the following “balayage order” between probability measures, which is stronger and more
natural than the convex order, at least in higher dimensions. We say that probability measures $\mu$ and
$\nu$ are in subharmonic order, $\mu \le_{SH} \nu$, if

$\int_{\mathbb{R}^d}\varphi\,d\mu \le \int_{\mathbb{R}^d}\varphi\,d\nu$ for every subharmonic function $\varphi$ on $\mathbb{R}^d$. (6.1)

For simplicity, we shall assume that $\mu$ and $\nu$ have compact support so as to avoid integrability issues.
Since convex functions are subharmonic, it is clear that $\mu \le_{SH} \nu \Rightarrow \mu \le_C \nu$ and that the two notions
are equivalent in one dimension.

Note that if $(B_t)_t$ is a $d$-dimensional Brownian motion with initial distribution $\mu$ and if $\nu$ is the
distribution of $B_T$ where $T$ is a stopping time such that $(B_{t\wedge T})_t$ is a uniformly integrable martingale,
then $\mu \le_{SH} \nu$. Such stopping times are normally called standard. The converse is also true and
belongs to a family of results known as Skorokhod embeddings (e.g., see Obłój [24]). In other
words, (6.1) is essentially equivalent to

$\mu \sim B_0$ and $\nu \sim B_T$ for a (possibly randomized) standard stopping time $T$. (6.2)

We now consider the Newtonian potential (or simply, potential) $P_\mu$ of a probability measure $\mu$ with
compact support, that is

$P_\mu(x) = \frac{1}{d(2-d)\omega_d}\int_{\mathbb{R}^d}|x-y|^{2-d}\,d\mu(y)$,

in such a way that in dimension $d \ge 3$, we have $\Delta P_\mu = \mu$ (in the sense of distributions). Note that
(6.1) then implies that

$P_\mu(x) \le P_\nu(x)$, $\forall x \in \mathbb{R}^d$. (6.3)

The converse is also true at least for d > 3. See Falkner m. 

Finally, note that if we consider an elliptic operator $L_t = \sum_{ij} a_{ij}(t)\,\partial_i\partial_j$ corresponding to a
one-parameter family of positive matrices $(a_{ij}(t))$, $t \ge 0$, and if $\mu$, $\mu_t$ are measures with densities $\rho$, $\rho_t$
respectively, where

$\partial_t\rho_t - L_t\rho_t = 0$ for $t > 0$ and in $\mathbb{R}^d$,
$\rho_0 = \rho$, (6.4)

then one can easily verify that $\mu \le_{SH} \mu_t$. Actually, one can show that

$P_\mu(x) < P_{\mu_t}(x)$, $\forall x \in \mathbb{R}^d$. (6.5)

The importance of such a strict inequality will be clear thereafter. The following is the main result
of this section.

Theorem 6.1. Assume $\mu \le_{SH} \nu$ where $\mu$, $\nu$ are probability measures with compact support on $\mathbb{R}^d$
such that $\mu \ll \mathcal{L}^d$. Assume the function $P_\nu - P_\mu$ is lower semi-continuous and consider the open set
$U := \{x \in \mathbb{R}^d \mid P_\nu(x) - P_\mu(x) > 0\}$. If $\pi \in MT(\mu,\nu)$ is an optimal solution for the minimization problem
where the cost function is either $c(x,y) = |x-y|$ or $c(x,y) = -|x-y|$, then:

(1) For each $x \in U$, there exists a ball $B$ centered at $x$ such that $\pi$ admits a c-dual on $B$.

(2) For $\mu$-a.e. $x \in U$, $\operatorname{supp}\pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))$. In particular, Conjecture 1 holds if
$\mu(U) = 1$.


Remark 6.2. The assumption that $\nu$ is compactly supported can be replaced with appropriate decay
conditions on $P^\nu - P^\mu$ and $\nabla(P^\nu - P^\mu)$. In particular, Conjecture 1 holds for $\mu$ and $\nu = \mu_t$ from the
diffusion example in (6.4) for $d \ge 3$, if the initial measure $\mu$ is absolutely continuous and compactly
supported. Note that the 2-dimensional case is true in full generality, that is, when the marginals are
simply in convex order (see Section 7).


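Proof of Theorem 6.1. Denoting $E_m = E_m(c, \mathbb{R}^d, \mathbb{R}^d)$, we have from Lemma 5.3 that

\begin{align}
I &:= \min\Big\{ \int_{\mathbb{R}^d \times \mathbb{R}^d} c(x,y)\, d\pi \ ;\ \pi \in MT(\mu, \nu) \Big\} \tag{6.6} \\
  &= \sup\Big\{ \int_{\mathbb{R}^d} \beta\, d\nu - \int_{\mathbb{R}^d} \alpha\, d\mu \ ;\ (\alpha, \gamma, \beta) \in E_m \text{ for some } \gamma \in C_b(\mathbb{R}^d, \mathbb{R}^d) \Big\}. \tag{6.7}
\end{align}

Let $\pi$ be an optimal solution for the minimization problem, and let $\Gamma$ be a martingale-monotone
regular concentration set for $\pi$ (as in Definition 5.2). Fix a bounded open set $\Omega$ which is sufficiently
large such that $\operatorname{supp}(\mu) \subset \Omega$. We shall show that for each $x_0 \in U \cap \Omega$, there exists a ball $B = B(x_0) \subset
U \cap \Omega$ centred at $x_0$, such that $\pi$ has a $c$-dual on $B$.

For that, consider a maximizing sequence for the dual problem, that is, admissible triples $(\alpha_n, \gamma_n, \beta_n) \in
E_m(c, \mathbb{R}^d, \mathbb{R}^d)$ such that

\[ I = \lim_{n \to \infty} \int \beta_n\, d\nu - \int \alpha_n\, d\mu. \tag{6.8} \]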


In view of Theorem 3.2 and Remark 3.4, we can assume that the triplet $(\alpha_n, \gamma_n, \beta_n) \in E_m(c, \Omega, \mathbb{R}^d)$,
that $\alpha_n$ is Lipschitz in $\Omega$, $\gamma_n$ is bounded in $\Omega$, and that

\[ \beta_n(x) \le \alpha_n(x) \le \beta_n(x) + \delta \quad \text{for all } x \in \Omega, \tag{6.9} \]

where $0 \le \delta \le 2\operatorname{diam}(\Omega)$. Note that $\delta = 0$ if $c(x,y) = |x - y|$.

We consider the convex function

\[ \chi_n(y) := \sup_{x \in \Omega} \{ -\alpha_n(x) - \gamma_n(x) \cdot (y - x) \}. \]

Since $\alpha_n$ and $\gamma_n$ are bounded and the set $\Omega$ is bounded, the functions $\chi_n$ are Lipschitz on $\mathbb{R}^d$. Note
also that by adding a sequence of affine functions $L_n$ (since $L_n(y) = L_n(x) + \nabla L_n(x) \cdot (y - x)$) the new
sequence $(\alpha_n + L_n, \beta_n + L_n, \gamma_n + \nabla L_n)$ will still have the same properties. By adding appropriate affine
functions $L_n$ and their gradients to the triple $(\alpha_n, \gamma_n, \beta_n)$, we may therefore assume that

\[ \chi_n(x) = 0 \ \text{ and } \ \chi_n \ge 0 \ \text{ for every } n. \]

We now show that a subsequence of $\alpha_n, \gamma_n$ converges locally in $U \cap \Omega$. We first establish suitable
estimates on $\chi_n$. Consider the Lipschitz function $q(y) := \sup_{x \in \Omega} c(x,y)$ and note that

\[ -\alpha_n(y) \le \chi_n(y) \ \ \forall y \in \Omega \quad \text{and} \quad \chi_n(y) \le q(y) - \beta_n(y) \ \ \forall y \in \mathbb{R}^d. \tag{6.10} \]


Hence,

\[ 0 \le \int \chi_n\, (d\nu - d\mu) \le -\int \beta_n\, d\nu + \int \alpha_n\, d\mu + C_1, \]

where $C_1 = \int q(y)\, d\nu(y) < \infty$, since $q$ is Lipschitz and $\nu$ has finite first moment. Since $(\alpha_n, \gamma_n, \beta_n)$ is
a maximizing sequence, for all sufficiently large $n$,

\[ 0 \le \int \chi_n\, (d\nu - d\mu) \le -I + C_1 + 1 =: C_2. \]

Hence,

\[ C_2 \ge \int \chi_n\, (d\nu - d\mu) = \int \chi_n\, \Delta(P^\nu - P^\mu) = \int \Delta\chi_n\, (P^\nu - P^\mu), \tag{6.11} \]

where $\Delta\chi_n$ is the distributional Laplacian of the convex function $\chi_n$. For the second last equality
note that $\Delta P^\mu = \mu$, $\Delta P^\nu = \nu$, and for the last equality note that $\chi_n$ is convex Lipschitz and $P^\nu - P^\mu$,
$\nabla(P^\nu - P^\mu)$ decay to zero at infinity by assumption, enabling us to integrate by parts.

Now fix $x_0 \in U \cap \Omega$ and pick a closed ball $B := B_r(x_0) \subset U \cap \Omega$ of radius $r$, centered at $x_0$. Since
$P^\nu - P^\mu$ is lower semi-continuous and strictly positive on $U$, we have $\epsilon_B := \min_B [P^\nu - P^\mu] > 0$ which,
in view of (6.11) and of the nonnegativity of $\Delta\chi_n$ and $P^\nu - P^\mu$, implies that

\[ \int_B \Delta\chi_n \le \frac{C_2}{\epsilon_B}. \]

Now, modulo approximating it by smooth convex functions, we can assume that $\chi_n$ is smooth and
apply Proposition B.1 to conclude that $\chi_n$ is bounded in a smaller ball $B_{r'}(x_0)$, uniformly in $n$. In
view of (6.9) and (6.10), the uniform boundedness of $\chi_n$ then implies the uniform boundedness of
$\alpha_n, \beta_n$ on $B_{r'}(x_0)$. Moreover, since

\[ -\alpha_n(x) - \gamma_n(x) \cdot (y - x) \le \chi_n(y) \le C, \quad \forall x, y \in B_{r'}(x_0), \]

we also find that $\gamma_n$ is uniformly bounded in $n$ on a smaller ball $B = B_{r''}(x_0)$, $r'' < r' < r$, in such a
way that the sequences $(\alpha_n, \gamma_n, \beta_n)_n$ are all uniformly bounded on $B$.

Apply now Komlós' theorem, which states that every $L^1$-bounded sequence of real functions has
a subsequence such that the arithmetic means of all its subsequences converge pointwise almost
everywhere. Since the arithmetic means of $\alpha_n, \beta_n, \gamma_n$ also yield a maximizing sequence of admissible
triples for (6.8), we can therefore assume that the original functions $\alpha_n$, $\beta_n$ and $\gamma_n$ converge $\mathcal{L}^d$-a.e.
in $B$ to, say, $\alpha$, $\beta$, and $\gamma$ on $X \subset B$, where $\mathcal{L}^d(B \setminus X) = 0$. Notice that these limits are bounded in $X$.

It is not clear, however, that this triple $(\alpha, \gamma, \beta)$ will give the desired one, especially because $\beta$ is only
defined in $X$, not in all of $\mathbb{R}^d$. We thus proceed as follows. Define

\[ \beta_{X,n}(y) := \inf_{x \in X} \{ c(x,y) + \alpha_n(x) + \gamma_n(x) \cdot (y - x) \}. \]

Notice that since $\alpha_n, \gamma_n$ are bounded in $X$ uniformly in $n$ and $y \mapsto c(x,y)$ is Lipschitz in $\mathbb{R}^d$ with
uniformly bounded Lipschitz constants for $x \in X$, we immediately see that the function $y \in \mathbb{R}^d \mapsto
\beta_{X,n}(y)$ is Lipschitz (uniformly in $n$) and is uniformly bounded on each compact set. Therefore,
there exists a subsequence, which we still denote by $\beta_{X,n}$, that converges to a Lipschitz function $\beta_X$
uniformly on each compact set in $\mathbb{R}^d$. Moreover, from the definition of $\beta_{X,n}$, the triple $(\alpha_n, \gamma_n, \beta_{X,n})$
satisfies

\[ \beta_{X,n}(y) - \alpha_n(x) - \gamma_n(x) \cdot (y - x) \le c(x,y) \quad \forall (x,y) \in X \times \mathbb{R}^d. \]

Thus $(\alpha_n, \gamma_n, \beta_{X,n}) \in E_m(c, X, \mathbb{R}^d)$. Also, taking the limit as $n \to \infty$, the above inequality still holds
in the limit, and so the triple $(\alpha, \gamma, \beta_X) \in E_m(c, X, \mathbb{R}^d)$.
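The construction of $\beta_{X,n}$ as an infimum over $x$ can be sketched on a finite grid. The following is a minimal one-dimensional illustration under stated assumptions: the grid `xs`, the functions `alpha` and `gamma`, and the helper name `c_transform` are all hypothetical choices made for the example, with cost $c(x,y) = |x - y|$. It checks the two properties used in the text: the admissibility inequality holds by construction, and the resulting function is Lipschitz with constant at most $1 + \sup|\gamma|$.

```python
from typing import Callable, Sequence


def c_transform(xs: Sequence[float],
                alpha: Callable[[float], float],
                gamma: Callable[[float], float],
                cost: Callable[[float, float], float]) -> Callable[[float], float]:
    """Return y -> min over x in xs of cost(x, y) + alpha(x) + gamma(x) * (y - x)."""
    def beta(y: float) -> float:
        return min(cost(x, y) + alpha(x) + gamma(x) * (y - x) for x in xs)
    return beta


# Hypothetical bounded data on the grid X = {-1.0, -0.9, ..., 1.0}.
xs = [i / 10.0 for i in range(-10, 11)]
alpha = lambda x: x * x
gamma = lambda x: 0.5 * x
cost = lambda x, y: abs(x - y)

beta = c_transform(xs, alpha, gamma, cost)
ys = [j / 4.0 for j in range(-12, 13)]

# Admissibility: beta(y) - alpha(x) - gamma(x) (y - x) <= c(x, y) on X x Y.
admissible = all(beta(y) - alpha(x) - gamma(x) * (y - x) <= cost(x, y) + 1e-12
                 for x in xs for y in ys)
# Lipschitz bound with constant 1 + max |gamma| = 1.5.
lipschitz = all(abs(beta(y1) - beta(y2)) <= 1.5 * abs(y1 - y2) + 1e-12
                for y1 in ys for y2 in ys)
print(admissible, lipschitz)
```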

To show that the triple $(\alpha, \gamma, \beta_X)$ is a $c$-dual to $\pi$ on $B$ (in the sense of Definition 5.4), it remains to
verify (5.4). For this, observe from the definition of $\beta_{X,n}$ that $\beta_n(y) \le \beta_{X,n}(y)$ for all $y \in \mathbb{R}^d$. Thus,

\[ \int \beta_n\, d\nu_B - \int \alpha_n\, d\mu_B \le \int \beta_{X,n}\, d\nu_B - \int \alpha_n\, d\mu_B \le \int c(x,y)\, d\pi_B(x,y). \]

Noting that the maximizing sequence of admissible triples $(\alpha_n, \gamma_n, \beta_n)$ for $\pi$ is also a maximizing
sequence of admissible triples for $\pi_B$, i.e.,

\[ \lim_{n \to \infty} \int \beta_n(y)\, d\nu_B(y) - \int \alpha_n(x)\, d\mu_B(x) = \int c(x,y)\, d\pi_B(x,y), \]

we therefore have that

\[ \lim_{n \to \infty} \int \beta_{X,n}(y)\, d\nu_B(y) - \int \alpha_n(x)\, d\mu_B(x) = \int c(x,y)\, d\pi_B(x,y). \]

To bring the limit inside the integrals, recall that $\beta_{X,n}$ is uniformly Lipschitz (in $n$) and $\alpha_n$ is uniformly
bounded, $\mu(B \setminus X) = 0$ and $\nu_B$ has finite first moment. Thus, by the dominated convergence theorem,

\[ \int \beta_X(y)\, d\nu_B(y) - \int \alpha(x)\, d\mu_B(x) = \int c(x,y)\, d\pi_B(x,y). \]

Therefore, the triple $(\alpha, \gamma, \beta_X)$ is a $c$-dual to $\pi$ on $B$, proving item (1). Then by Theorem 5.5, for
a.e. $x \in B$ we have that $\operatorname{supp} \pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp} \pi_x))$. As $U$ can be covered by countably many
such balls $B$, for $\mu$-a.e. $x \in U$ we have that $\operatorname{supp} \pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp} \pi_x))$, proving item (2). □


7. A CANONICAL DECOMPOSITION FOR THE SUPPORT OF MARTINGALE TRANSPORTS 

We have shown in the last sections that Conjecture 1 holds whenever the dual problem is (locally)
attained. In this section, we shall decompose an optimal martingale transport $\pi$ into components on
which an induced martingale transport problem is defined in such a way that its dual problem is
attained. For that, we shall first associate to any Borel set $\Gamma \in S_{MT}$ a unique irreducible convex
paving $\Phi$. We then show that if every finite subset of $\Gamma$ is a $c$-contact layer (a property satisfied by a
concentration set of an optimal martingale measure), then every subset $\Gamma_C = \Gamma \cap (C \times \mathbb{R}^d)$, where $C$
is a component of the convex paving $\Phi$, is a $c$-contact layer.

7.1 Irreducible convex pavings associated to martingale supporting sets: Let $\Gamma$ be a Borel set
in $S_{MT}$. We start by defining an equivalence relation on $X_\Gamma$. For each $x \in X := X_\Gamma$, we define
inductively an increasing sequence of convex open sets $(C_n(x))_n$ in the following way:

Start with the trivial equivalence relation $x \sim_0 x'$ iff $x = x'$. Let $C_0(x) := IC(\Gamma_x)$ and recall that
if $\Gamma_x = \{x\}$, then $C_0(x) = \{x\}$. Now define the following equivalence relation on $X$: $x \sim_1 x'$ if there
exist finitely many $x_1, \dots, x_k$ in $X$ such that the following chain condition holds:

\begin{align*}
& C_0(x) \cap C_0(x_1) \neq \emptyset, \\
& C_0(x_i) \cap C_0(x_{i+1}) \neq \emptyset \quad \forall i = 1, 2, \dots, k-1, \\
& C_0(x_k) \cap C_0(x') \neq \emptyset.
\end{align*}

We then consider the open convex hull:

\[ C_1(x) := IC\Big[ \bigcup_{x' \sim_1 x} C_0(x') \Big]. \]

Note that $x \sim_1 x'$ implies $C_1(x) = C_1(x')$. Unfortunately, the convex sets $C_1(x)$ do not determine
the equivalence classes. In particular, they may not be mutually disjoint for elements that are not
equivalent for $\sim_1$. So, we proceed to define $\sim_2$ in a similar way: $x \sim_2 x'$ if there exist finitely many
$x_1, \dots, x_k$ in $X$ such that the following chain condition holds:

\begin{align*}
& C_1(x) \cap C_1(x_1) \neq \emptyset, \\
& C_1(x_i) \cap C_1(x_{i+1}) \neq \emptyset \quad \forall i = 1, 2, \dots, k-1, \\
& C_1(x_k) \cap C_1(x') \neq \emptyset;
\end{align*}

and we set

\[ C_2(x) := IC\Big[ \bigcup_{x' \sim_2 x} C_1(x') \Big]. \]

Again, $\sim_2$ is an equivalence relation and one can easily see that

• $x \sim_1 x' \Rightarrow x \sim_2 x'$

• $x \sim_2 x' \Rightarrow C_2(x) = C_2(x')$

• $C_1(x) \subseteq C_2(x)$.

But still, the sets $C_2(x)$ may not be mutually disjoint for non-equivalent $x$'s. We continue inductively
in a similar fashion by defining equivalence relations $\sim_n$ for $n = 1, 2, \dots$ and their corresponding
classes

\[ C_n(x) := IC\Big[ \bigcup_{x' \sim_n x} C_{n-1}(x') \Big]. \]

It is easy to check that we have the following properties for each $n$:

\begin{align*}
& x \sim_n x' \Rightarrow x \sim_{n+1} x', \\
& x \sim_n x' \Rightarrow C_n(x) = C_n(x'), \\
& C_n(x) \subseteq C_{n+1}(x).
\end{align*}

Finally, define the equivalence relation

\[ x \sim x' \ \text{ if } \ x \sim_n x' \ \text{ for some } n, \]

and its corresponding convex sets

\[ C(x) := \lim_{n \to \infty} C_n(x) = \bigcup_{n=0}^{\infty} C_n(x). \tag{7.1} \]
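The iterative construction can be sketched in code in the simplest setting. The following is a minimal one-dimensional illustration, not the general algorithm: it assumes $d = 1$, a finite set of source points, and finite fibers $\Gamma_x$, so that every $C_n(x)$ is either an open interval or a single point. The names `ic`, `intersects` and `convex_paving` are hypothetical. In one dimension the iteration stabilizes after the first merge; the need for several rounds is a genuinely higher-dimensional phenomenon that this sketch does not exhibit.

```python
from typing import Dict, List, Tuple

# (lo, hi) encodes the open interval (lo, hi) if lo < hi, and the single point {lo} if lo == hi.
Interval = Tuple[float, float]


def ic(points: List[float]) -> Interval:
    """Relative interior of the convex hull of a finite subset of the real line."""
    return (min(points), max(points))


def intersects(a: Interval, b: Interval) -> bool:
    a_point, b_point = a[0] == a[1], b[0] == b[1]
    if a_point and b_point:
        return a[0] == b[0]
    if a_point:
        return b[0] < a[0] < b[1]
    if b_point:
        return a[0] < b[0] < a[1]
    return max(a[0], b[0]) < min(a[1], b[1])


def convex_paving(gamma: Dict[float, List[float]]) -> Dict[float, Interval]:
    """Map each source point x to its component C(x), iterating C_n -> C_{n+1} until stable."""
    xs = sorted(gamma)
    comp = {x: ic(gamma[x]) for x in xs}  # C_0(x) = IC(Gamma_x)
    while True:
        parent = {x: x for x in xs}

        def find(x: float) -> float:
            while parent[x] != x:
                x = parent[x]
            return x

        # Chain condition: transitive closure of "C_n(x) meets C_n(x')".
        for i, x in enumerate(xs):
            for y in xs[i + 1:]:
                if intersects(comp[x], comp[y]):
                    parent[find(x)] = find(y)
        # C_{n+1}(x) = IC of the union of C_n(x') over the class of x.
        hull: Dict[float, Interval] = {}
        for x in xs:
            root = find(x)
            lo, hi = comp[x]
            if root in hull:
                hull[root] = (min(hull[root][0], lo), max(hull[root][1], hi))
            else:
                hull[root] = (lo, hi)
        new_comp = {x: hull[find(x)] for x in xs}
        if new_comp == comp:
            return comp
        comp = new_comp


example = {0.5: [0.0, 1.0], 0.8: [0.6, 2.0], 2.5: [2.2, 2.8], 3.0: [3.0]}
print(convex_paving(example))
```

In the example, the fibers over 0.5 and 0.8 overlap and merge into the component $(0, 2)$, while the other two source points keep their own components.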

Now, we show that $\Phi = \{C(x)\}_{x \in X}$ is an irreducible convex paving for $\Gamma$.

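Theorem 7.1. The canonical relation $\sim$ on $X_\Gamma$ and the components $(C(x))_{x \in X_\Gamma}$ satisfy the following:

(1) $x \sim x' \Rightarrow C(x) = C(x')$, and $x \nsim x' \Rightarrow C(x) \cap C(x') = \emptyset$.

(2) The sets $C(x)$ are mutually disjoint, that is, either $C(x) = C(x')$ or $C(x) \cap C(x') = \emptyset$.

(3) $x' \in X \cap C(x)$ if and only if $x' \sim x$.

(4) $\Phi = \{C(x)\}_{x \in X}$ is an irreducible convex paving for $\Gamma$.

(5) $C_n(x) = IC\big[\bigcup_{x' \sim_n x} \Gamma_{x'}\big]$ for $n \ge 0$ and $C(x) = IC\big[\bigcup_{x' \sim x} \Gamma_{x'}\big]$. In particular, $\Gamma_x \subseteq \overline{C(x)}$.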

Proof. The fact that $x \sim_n x' \Rightarrow C_n(x) = C_n(x')$ gives the first part of (1). If there exists a $z \in
C(x) \cap C(x')$, then there is $N$ such that $z \in C_N(x) \cap C_N(x')$, implying $x \sim_{N+1} x'$ and verifying the
second part of (1), of which (2) and (3) are obvious consequences.

To prove (4), let $\Psi$ be any convex paving of $\Gamma$ and let $z, x \in X_\Gamma$, $D \in \Psi$ be such that $C(z) \cap D(x) \neq \emptyset$.
We must show that $C(z) \subseteq D(x)$. We claim that, for any $n \ge 0$,

\[ (*) \qquad C_n(z) \cap D(x) \neq \emptyset \ \Rightarrow \ C_n(z) \subseteq D(x), \quad \text{for every } z, x \in X_\Gamma. \]

Indeed, it is true for $n = 0$ by definition. Assume that (*) is true for some $n$, and suppose $C_{n+1}(z) \cap
D(x) \neq \emptyset$. Note that $C_n(z) \subseteq D(z)$, and so if $w \sim_{n+1} z$, by (*) we have that $C_n(w) \subseteq D(z)$. As
$C_{n+1}(z) = IC\big[\bigcup_{w \sim_{n+1} z} C_n(w)\big]$, this readily implies that $C_{n+1}(z) \subseteq D(z)$, but then $D(z) \cap D(x) \neq \emptyset$ and
hence $D(z) = D(x)$. This proves (*) for every $n \ge 0$. Now if $C(z) \cap D(x) \neq \emptyset$, then for all large $n$,
$C_n(z) \cap D(x) \neq \emptyset$, hence by (*) we get that $C_n(z) \subseteq D(x)$. Therefore $C(z) \subseteq D(x)$, which proves the
irreducibility of $\Phi$.

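For (5), let $(A_i)_{i \in I}$ be any family of sets in $\mathbb{R}^d$, where $I$ is an index set. Then it is easy to see that

\[ IC(A_i) = IC(CC(A_i)), \quad \text{and} \quad CC\Big(\bigcup_{i \in I} CC(A_i)\Big) = CC\Big(\bigcup_{i \in I} A_i\Big) = CC\Big(\bigcup_{i \in I} IC(A_i)\Big). \]

But note that $A \subseteq B$ does not imply $IC(A) \subseteq IC(B)$ in general. The above implies in particular

\[ IC\Big(\bigcup_{i \in I} A_i\Big) = IC\Big(\bigcup_{i \in I} IC(A_i)\Big). \]

In addition, a simple induction shows that for every $n \ge 0$, we have

\[ C_n(x) = IC\Big[ \bigcup_{x' \sim_n x} \Gamma_{x'} \Big]. \]

Indeed, it is true for $n = 0$ by definition. Suppose $C_n(x) = IC\big[\bigcup_{x' \sim_n x} \Gamma_{x'}\big]$. Now by definition,

\[ C_{n+1}(x) = IC\Big[ \bigcup_{x' \sim_{n+1} x} C_n(x') \Big] = IC\Big[ \bigcup_{x' \sim_{n+1} x} IC\Big( \bigcup_{x'' \sim_n x'} \Gamma_{x''} \Big) \Big] = IC\Big[ \bigcup_{x' \sim_{n+1} x} \Big( \bigcup_{x'' \sim_n x'} \Gamma_{x''} \Big) \Big]. \]

But $\bigcup_{x' \sim_{n+1} x} \bigcup_{x'' \sim_n x'} \Gamma_{x''} = \bigcup_{x' \sim_{n+1} x} \Gamma_{x'}$, hence $C_{n+1}(x) = IC\big(\bigcup_{x' \sim_{n+1} x} \Gamma_{x'}\big)$, completing the induction.

Finally, we proceed as follows:

\begin{align*}
C(x) = IC[C(x)] &= IC\Big[ \bigcup_{n \ge 0} C_n(x) \Big] = IC\Big[ \bigcup_{n \ge 0} IC\Big( \bigcup_{x' \sim_n x} \Gamma_{x'} \Big) \Big] \\
&= IC\Big[ \bigcup_{n \ge 0} \bigcup_{x' \sim_n x} \Gamma_{x'} \Big] = IC\Big[ \bigcup_{x' \sim x} \Gamma_{x'} \Big],
\end{align*}

which completes the proof of (5) and the theorem. □

7.2 When irreducible components are c-contact layers: Let $c : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ be a cost function
on which we make no assumption. Our aim is to prove Theorem 2.11, which will follow from the
following.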

Theorem 7.2. Let $\Gamma \in S_{MT}$ be $c$-finitely exposable. If $\Phi$ is the irreducible convex paving of $\Gamma$, then
for every convex component $C$ in $\Phi$, the set $\Gamma \cap (C \times \mathbb{R}^d) = \Gamma \cap (C \times \overline{C})$ is a $c$-contact layer.

First, we prove the following lemma.

Lemma 7.3. Let $\Gamma \in S_{MT}$ be $c$-finitely exposable, and denote $X := X_\Gamma$. Fix $x_0 \in X$ and set
$G := \Gamma \cap (C(x_0) \times \mathbb{R}^d)$, where $C(x_0)$ is the component of the irreducible convex paving $\Phi$ of $\Gamma$ that
contains $x_0$. Then, for each $y \in Y_G$, there exists a compact interval $K_y \subset \mathbb{R}$ such that any finite subset
$H \subset G$ is a $c$-contact layer for a triplet $(\alpha, \gamma, \beta)$, where $\beta(y) \in K_y$ for all $y \in Y_H$.

The above lemma is essentially saying that there is some uniformity in the way $c$-admissible
triplets can expose finite subsets of $G$ as $c$-contact layers. This control on the $\beta$ component of the
$c$-admissible triplets will allow us to use Tychonoff's compactness theorem to deduce that the whole
of $G$ is a $c$-contact layer.


To prove Lemma 7.3 we first give an idea about the degrees of freedom we have in choosing
$\beta$. First, note that if $\beta$ is $c$-admissible for $G$ (meaning that there are $\alpha, \gamma$ such that $G \subset \Gamma_{(\alpha,\gamma,\beta)}$) and
$L : \mathbb{R}^d \to \mathbb{R}$ is an affine function, then $\beta - L$ is also $c$-admissible for $G$. Letting $m = \dim(V(Y_G))$, we
can find $\{y_0, \dots, y_m\} \subset Y_G$ such that $V(\{y_0, \dots, y_m\}) = V(Y_G)$, i.e. $\{y_0, \dots, y_m\}$ constitute vertices of an
$m$-dimensional polytope in $V(Y_G)$. Now for a given $c$-admissible function $\beta$ for $G$, let $L : V(Y_G) \to \mathbb{R}$
be the affine function determined by $L(y_i) = \beta(y_i)$ for $i = 0, 1, \dots, m$. The function $\beta' := \beta - L$ then
satisfies $\beta'(y_i) = 0$ for all $i = 0, 1, \dots, m$, which means that we have $m + 1$ degrees of freedom in choosing
the values of $\beta$. In other words, if we set $K_{y_i} := \{0\}$ for $i = 0, 1, \dots, m$, then we can find $\beta'$ such that
$\beta'(y_i) \in K_{y_i}$ for each $y_i$. Now, we want to observe how the initial value of $\beta$ can control its values
at other points $y$. We shall see that the control of the value of $\beta$ propagates well along a given chain
inside the equivalence class $C(x_0)$.


The proof of Lemma 7.3 is involved, and requires several key steps. To clarify the idea, we
consider first the special case $c = 0$, where we can establish a complete control on the dual functions.

Lemma 7.4. Let $G \in S_{MT}$ and assume that it is a 0-contact layer for a triplet $(\alpha, \gamma, \beta)$, that is

\begin{align}
\beta(y) &\ge L_x(y) \quad \forall x \in X_G,\ y \in Y_G, \tag{7.2} \\
\beta(y) &= L_x(y) \quad \forall (x,y) \in G, \tag{7.3}
\end{align}

where for each $x$, $L_x$ is the affine function

\[ L_x(y) := \gamma(x) \cdot (y - x) + \alpha(x). \]

Then, $L_x = L_{x'}$ on $V(C(x))$ whenever $x \sim x'$.

Note that (7.3) says that if we have control on $L_x$, then we have control on $\beta$ for all $y \in G_x$. In
particular, Lemma 7.4 implies that if $L_x = 0$ (we can choose such $L_x$ without loss of generality) then
$L_{x'} = 0$ on $V(C(x))$ for all $x' \in C(x)$, thus $\alpha(x') = 0$ for all $x' \in C(x)$ and $\beta(y) = 0$ at each $y \in G_{x'}$.


The above lemma is a consequence of the following proposition.

Proposition 7.5. Let $L_1, L_2$ be two affine functions on $\mathbb{R}^d$, and let $S_1, S_2$ be sets in $\mathbb{R}^d$. Suppose that
$L_1 \le L_2$ on $S_1$, and $L_2 \le L_1$ on $S_2$, and that $IC(S_1) \cap IC(S_2) \neq \emptyset$. Then, $L_1 = L_2$ on $V(S_1 \cup S_2)$, where the
latter is the minimal affine space containing the sets $S_1$ and $S_2$.

Proof. This follows from two facts:

(1) For affine functions, $L \le L'$ on a set $S$ implies $L \le L'$ on $\operatorname{conv}(S)$.

(2) If two affine functions $L, L'$ satisfy $L \le L'$ on a set $S$ and if moreover, $L(z) = L'(z)$ at some
interior point $z$ of $\operatorname{conv}(S)$, then $L = L'$ on $\operatorname{conv}(S)$, thus on $V(S)$.

Indeed, apply (1) to the case $L = L_1$, $L' = L_2$, and $S = S_1$, and also to the case $L = L_2$, $L' = L_1$, and
$S = S_2$. We get $L_1 = L_2$ on $\operatorname{conv}(S_1) \cap \operatorname{conv}(S_2)$. Now, from the assumption, $IC(S_1) \cap IC(S_2) \neq \emptyset$,
and also obviously $IC(S_1) \cap IC(S_2) \subseteq IC(S_i)$, $i = 1, 2$. Using (2) we then get that $L_1 = L_2$ on both
$\operatorname{conv}(S_i)$, $i = 1, 2$. From this the assertion follows. □

Proof of Lemma 7.4. First note that for each $x, x' \in X = X_G$, conditions (7.2) and (7.3) yield that
$L_{x'} \le L_x$ on $G_x$, and $L_x \le L_{x'}$ on $G_{x'}$.

Now to prove the lemma, it suffices to show that for each $n \in \{0, 1, 2, \dots\}$,

\[ \text{if } x \sim_n x' \text{ then } L_x = L_{x'} \text{ on } V(C_n(x)). \tag{7.4} \]

Here, by $x \sim_0 x'$ we mean $x = x'$. We do this inductively. Our induction hypothesis is (7.4) together
with

\[ L_z \le L_x \text{ on } C_n(x), \quad \text{for each } z \in X \text{ and } x \in X. \tag{7.5} \]

For $n = 0$, (7.4) is trivially satisfied and (7.5) follows from (7.2) and (7.3). Now, assume that (7.4)
and (7.5) hold for all $n \le k$. For $n = k + 1$, if $x \sim_{k+1} x'$ then there are $x = x_0, x_1, \dots, x_m = x'$ for
some $m$, such that $C_k(x_i) \cap C_k(x_{i+1}) \neq \emptyset$ for each $0 \le i \le m - 1$. From this, and using (7.4) and
(7.5), we can apply Proposition 7.5 with the choice $L_1 = L_{x_i}$, $L_2 = L_{x_{i+1}}$, $S_1 = C_k(x_i)$, $S_2 = C_k(x_{i+1})$,
and see $L_{x_i} = L_{x_{i+1}}$ on $V(C_k(x_i) \cup C_k(x_{i+1}))$, for $i = 0, \dots, m - 1$. Similarly, repeated application of
Proposition 7.5 eventually yields that $L_x = L_{x_i} = L_{x'}$ on each $C_k(x_i)$, for $i = 1, \dots, m$. Therefore,
$L_x = L_{x'}$ on $\bigcup_i C_k(x_i)$, thus, on $V\big(\bigcup_{i=0}^m C_k(x_i)\big)$. This holds for any $x \sim_{k+1} x'$, thus by applying the
result to all $z \sim_{k+1} x \sim_{k+1} x'$, we also see

\[ L_x = L_{x'} \text{ on } V\Big( \bigcup_{z \sim_{k+1} x} C_k(z) \Big) = V(C_{k+1}(x)), \]

verifying (7.4) for $n = k + 1$. For (7.5), for each $z \in X$, from the assumption for $n \le k$ and
applying (7.4), we have $L_z \le L_{x'} = L_x$ on $C_k(x')$ for all $x' \sim_{k+1} x$. For the affine functions, this
implies $L_z \le L_x$ on $C_{k+1}(x)$. This completes the induction argument, and hence the proof. □

We now consider the case of a non-trivial cost $c$. We first establish a more quantitative version
of Proposition 7.5.

Proposition 7.6. Let $L_1, L_2$ be two affine functions on $\mathbb{R}^d$, and let $S_1, S_2$ be sets in $\mathbb{R}^d$. Suppose that

• $L_1 \le L_2 + \delta_1$ on $S_1$ and $L_2 \le L_1 + \delta_2$ on $S_2$ for some constants $\delta_1, \delta_2 \ge 0$;

• there is a point $z$ in $IC(S_1) \cap IC(S_2)$.

Then, $|L_1 - L_2| \le C$ on $\operatorname{conv}(S_1 \cup S_2)$. Here, $C = C(z, S_1, S_2, \delta_1, \delta_2) < \infty$ as long as $z$ stays in
the interior $IC(S_1) \cap IC(S_2)$, though as $z$ gets close to the boundaries $\partial(\operatorname{conv}(S_i))$, $i = 1$ or $2$, the
constant $C$ may go to $+\infty$.

Proof. First, convexity and linearity imply that for each $\delta, \delta' \ge 0$ we have the following:

(1) For affine functions, $L \le L' + \delta$ on a set $S$ implies $L \le L' + \delta$ on $\operatorname{conv}(S)$.

(2) If two affine functions $L, L'$ satisfy $L \le L' + \delta$ on a set $S$ and if moreover, $L(z) \ge L'(z) - \delta'$
at some interior point $z$ of $\operatorname{conv}(S)$, then $|L - L'| \le C = C(z, S, \delta, \delta')$ on $\operatorname{conv}(S)$. Here, the
constant $C < \infty$ depends only on $\delta, \delta'$ and the ratio between the minimum distance from $z$ to
$\partial(\operatorname{conv}(S))$ and the maximum distance to $\partial(\operatorname{conv}(S))$, though, as $z$ gets close to $\partial(\operatorname{conv}(S))$,
the constant $C$ can go to $+\infty$.

Now, apply (1) to the case $L = L_1$, $L' = L_2$, and $S = S_1$, and also to the case $L = L_2$, $L' = L_1$, and
$S = S_2$. Thus, we get $|L_1 - L_2| \le \max(\delta_1, \delta_2)$ at the point $z$ of $IC(S_1) \cap IC(S_2)$. Now, apply (2), to
get $|L_1 - L_2| \le C$ on both $\operatorname{conv}(S_i)$, $i = 1, 2$, where $C = C(S_1, S_2, \delta_1, \delta_2) < \infty$. Applying (1) again,
we have $|L_1 - L_2| \le C$ on $\operatorname{conv}(S_1 \cup S_2)$, completing the proof. □
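The quantitative bound can be made concrete in a toy configuration. The sketch below is an illustration under stated assumptions, not part of the proof: it takes $d = 1$, $S_1 = \{0, 2\}$ and $S_2 = \{1, 3\}$ (so that $IC(S_1) \cap IC(S_2) = (1, 2)$), writes $D = L_1 - L_2 = at + b$, and computes the largest possible value of $|D|$ on $\operatorname{conv}(S_1 \cup S_2) = [0, 3]$ by enumerating the vertices of the polygon of feasible pairs $(a, b)$. The function name `max_gap` is hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def max_gap(delta1: float, delta2: float) -> float:
    """Largest |D| on [0, 3] over affine D(t) = a*t + b with
    D <= delta1 on S1 = {0, 2} and D >= -delta2 on S2 = {1, 3}."""
    # Each constraint is stored as (p, q, r), meaning p*a + q*b <= r.
    constraints: List[Tuple[float, float, float]] = [
        (0.0, 1.0, delta1),    # D(0) <= delta1
        (2.0, 1.0, delta1),    # D(2) <= delta1
        (-1.0, -1.0, delta2),  # D(1) >= -delta2
        (-3.0, -1.0, delta2),  # D(3) >= -delta2
    ]
    best = 0.0
    # The feasible set is a bounded polygon; |D(0)| and |D(3)| are convex in (a, b),
    # so their maximum is attained at a vertex (intersection of two active constraints).
    for (p1, q1, r1), (p2, q2, r2) in combinations(constraints, 2):
        det = p1 * q2 - p2 * q1
        if abs(det) < 1e-12:
            continue
        a = (r1 * q2 - r2 * q1) / det
        b = (p1 * r2 - p2 * r1) / det
        if all(p * a + q * b <= r + 1e-9 for p, q, r in constraints):
            # An affine function on [0, 3] attains its extreme values at the endpoints.
            best = max(best, abs(b), abs(3.0 * a + b))
    return best


for d1, d2 in [(0.0, 0.0), (1.0, 1.0), (1.0, 2.0)]:
    print(d1, d2, max_gap(d1, d2))
```

With $\delta_1 = \delta_2 = 0$ the gap is zero, which is the conclusion of Proposition 7.5 in this configuration. For positive constants a direct computation in this particular configuration gives the value $\max(2\delta_1 + \delta_2,\ \delta_1 + 2\delta_2)$, so the constant $C$ is finite but strictly larger than the pointwise bound $\max(\delta_1, \delta_2)$ available at $z$.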


From now on, we consider only the maximization case, since the minimization case is the same
by replacing $c(x,y)$ with $-c(x,y)$. We now introduce the following notation.

Definition 7.7. Let $G \in S_{MT}$ and let $H \subset G$ be a $c$-contact layer for a triplet $(\alpha, \gamma, \beta)$. For each
$x \in X_H$, consider the affine function

\[ L_x^H(y) := \gamma(x) \cdot (y - x) + \alpha(x). \]

The superscript $H$ indicates that $L_x^H$ arises from a $c$-admissible triplet for $H$. The fact that $H$ is a
$c$-contact layer for $(\alpha, \gamma, \beta)$ can be written as:

\begin{align}
\beta(y) - c(x,y) &\ge L_x^H(y), \quad \forall x \in X_H,\ y \in Y_H, \tag{7.6} \\
\beta(y) - c(x,y) &= L_x^H(y), \quad \forall (x,y) \in H. \tag{7.7}
\end{align}

For an affine space $V$, we write for $x, x' \in X_H$,

\[ L_x^H \approx L_{x'}^H \ \text{ on } V \]

if there is a bounded set $S$ with $V = V(S)$ and a constant $M = M(c, H, S)$ depending only on $H$, the
cost function $c$ and the set $S$, such that for every choice of a $c$-admissible triplet $(\alpha, \gamma, \beta)$ making $H$
a $c$-contact layer, we have

\[ |L_x^H - L_{x'}^H| \le M \ \text{ on the set } S. \tag{7.8} \]

We say $L_x^H \approx L_{x'}^H$ at $z$, if we have (7.8) for $S = \{z\}$.

An immediate observation is that for $x, x', x'' \in X_H$,

\[ \text{whenever } L_x^H \approx L_{x'}^H \text{ and } L_{x'}^H \approx L_{x''}^H \text{ on } V, \text{ then } L_x^H \approx L_{x''}^H \text{ on } V. \]

Also note that if $H' \subset H$, we necessarily have for any $x, x' \in X_{H'}$,

\[ L_x^{H'} \approx L_{x'}^{H'} \ \text{ on } V \ \Longrightarrow \ L_x^H \approx L_{x'}^H \ \text{ on } V. \tag{7.9} \]


We shall now prove the analogue of Lemma 7.4 in the case of a general cost. We shall again use
Proposition 7.6 to establish a propagation of control on the affine functions $L_x^H$, along an ordered
chain of intersecting convex open sets. But, since $c$ is not trivial anymore, the control on $L_x^H$ can be
done only in finitely many steps, since the errors (the constant $C$ in Proposition 7.6) can accumulate.

Lemma 7.8. Set $G := \Gamma \cap (C(x_0) \times \mathbb{R}^d)$ and suppose $x, x' \in X_G$ (i.e. $x \sim x'$). Then there exists a
finite set $H \subset G$ such that $x, x' \in X_H$ and $L_x^H \approx L_{x'}^H$ on $V(C(x))$.


Proof. First observe that it suffices to prove

Claim 7.9. Suppose $x \sim x'$ and $z \in G_{x'}$. Then there exists a finite set $H \subset G$ such that $x, x' \in X_H$,
and $L_x^H \approx L_{x'}^H$ at $z$.

The lemma follows when we apply this claim to a set of finitely many $(u_i, v_i)$'s, $i = 1, \dots, m$, in $G$
(so $x \sim x' \sim u_i$) with $V(C(x)) = V(\{v_i\}_{i=1}^m)$, and use (7.9).

We show this claim using induction on $n = 0, 1, 2, 3, \dots$. Our induction hypothesis is: if $x \sim_n x'$ and
$z \in G_{x'}$, then there exists a finite set $H \subset G$ such that

(i1) $x, x' \in X_H$, $z \in Y_H$ and $Y_H \subseteq \overline{C_n(x)}$;

(i2) $L_x^H \approx L_{x'}^H$ on $V(Y_H)$;

(i3) for each finite set $F \subset G$ with $H \subset F$, and for $w \in X_F$, there is a constant $C = C(H, w)$
depending only on $H$ and $w$ such that

\[ L_w^F \le L_x^F + C \ \text{ on } Y_H. \]

Notice that the claim follows if we verify (i1)-(i3) for each $n$, since in particular, the values of $L_x^H$
and $L_{x'}^H$ at $z$ are estimated from (i2).

We proceed by induction, starting with $n = 0$ and assuming $x \sim_0 x'$ (i.e. $x = x'$) and $z \in G_{x'}$. Choose
then $H = \{(x, z)\}$. Then, (i1) and (i2) are trivially satisfied. Moreover, (i3) holds from (7.6) and (7.7),
where the constant $C$ is estimated by the value $c(w, z) - c(x, z)$. This completes the case $n = 0$.

Suppose now the induction hypothesis holds for $n$. Assume $x \sim_{n+1} x'$ and $z \in G_{x'}$. Then, there
is a finite chain of $C_n(x_k)$, $k = 0, 1, \dots, m$, $x_0 = x$, $x_m = x'$, such that $C_n(x_k) \cap C_n(x_{k+1}) \neq \emptyset$
for each $k$. Recall $C_n(x) = IC\big[\bigcup_{y \sim_n x} G_y\big]$. Thus for each $C_n(x_k)$, it is possible to find a finite set
$J_k := \{(u_k^i, v_k^i)\}_{i=1}^{m_k} \subset G$ such that $x_k \sim_n u_k^i$ for all $i$ and $IC(Y_{J_k})$ is a good approximation of $C_n(x_k)$, i.e.
$Y_{J_k} \subset \overline{C_n(x_k)} \subset V(Y_{J_k})$ and $IC(Y_{J_k}) \cap IC(Y_{J_{k+1}}) \neq \emptyset$. Also, we can let $z \in Y_{J_m}$.

Now, apply the induction hypothesis for $n$ to each pair $x_k, u_k^i$ with $x_k \sim_n u_k^i$, and find a finite set $H_k^i \subset G$ that
satisfies (i1)-(i3) (for $x = x_k$, $x' = u_k^i$, $z = v_k^i$, and $H = H_k^i$). Let

\[ H_k := \bigcup_i H_k^i. \]

Then, from (i3) for the $H_k^i$'s, we also have (i3) for $H = H_k$ and $x = x_k$. Here, the point of considering
$H_k$ is that $Y_{J_k} \subset Y_{H_k} \subset \overline{C_n(x_k)}$, so $V(Y_{J_k}) = V(Y_{H_k})$, hence $IC(Y_{H_k}) \cap IC(Y_{H_{k+1}}) \neq \emptyset$ as well.

In order to verify the induction hypothesis for the $(n+1)$-th step, let

\[ H := \bigcup_k H_k. \]

We will show properties (i1)-(i3) for this set $H$. From the construction, $x, x' \in X_H$, $z \in Y_H$, and since
$C_n(x_k) \subset C_{n+1}(x_k) = C_{n+1}(x)$, (i1) readily follows. For (i2), apply the induction hypothesis (i3) for the
$H_k$'s to Proposition 7.6 iteratively for the pairs $Y_{H_0}$ and $Y_{H_1}$, $Y_{H_0} \cup Y_{H_1}$ and $Y_{H_2}$, ..., $Y_{H_0} \cup \dots \cup Y_{H_{m-1}}$
and $Y_{H_m}$. Then we see that the estimate (7.8) holds for $S = Y_H$, thus,

\[ L_x^H \approx L_{x_1}^H \approx \dots \approx L_{x_{m-1}}^H \approx L_{x'}^H \ \text{ on } V(Y_H), \tag{7.10} \]

verifying (i2).

For (i3), let $F$ be a finite set containing $H$ and let $w \in X_F$. Then (i3) for each $H_k$ gives that $L_w^F \le
L_{x_k}^F + C_k$ on $Y_{H_k}$, $k = 0, 1, \dots, m$. Now applying (7.10) and recalling (7.9), we conclude that there is a
constant $C = C(H, w)$ such that

\[ L_w^F \le L_x^F + C \ \text{ on } Y_H. \]

This completes the induction, and the proof. □


Proof of Lemma 7.3. Recall that we fix $x_0 \in X$ and let $G := \Gamma \cap (C(x_0) \times \mathbb{R}^d)$ and $V := V(Y_G)$. Let
$m = \dim(V)$. Then we can find

\[ J := \{(u_i, v_i)\}_{i=0}^m \subseteq G \ \text{ such that } \ V(\{v_i\}_{i=0}^m) = V. \]

Define the initial choices $K_{v_i} = \{0\}$, $i = 0, 1, \dots, m$. We want to define the $K_y$'s to be compatible with
these initial choices. For $y \in Y_G$, choose $x(y) \in X_G$ such that $(x(y), y) \in G$. By Lemma 7.8 (especially
see Claim 7.9), for $y \in Y_G$ we can choose a finite set $H(y)$ such that $J \cup \{(x(y), y)\} \subset H(y)$ and

\[ L_{x(y)}^{H(y)} \approx L_{u_i}^{H(y)} \ \text{ at } v_i, \quad \forall i = 0, \dots, m. \]

In particular, there exists a constant $M$, depending only on $y$ and $H(y)$ (but not on the choice of the
$c$-admissible functions for which $H(y)$ is a $c$-contact layer), such that

\[ \big| L_{x(y)}^{H(y)}(v_i) - L_{u_i}^{H(y)}(v_i) \big| \le M, \quad \forall i = 0, \dots, m. \tag{7.11} \]

$H(y)$ being a $c$-contact layer for some triplet $(\alpha, \gamma, \beta)$, we can, by subtracting an appropriate affine
function from $\beta$, assume $\beta(v_i) = 0$. This yields that

\[ \beta(y) = c(x(y), y) + L_{x(y)}^{H(y)}(y). \]

Since $L_{x(y)}^{H(y)}$ is affine and $V(\{v_i\}_{i=0}^m) = V$, the value $L_{x(y)}^{H(y)}(y)$ can be computed from the values $L_{x(y)}^{H(y)}(v_i)$.
Hence by (7.11), the values $L_{u_i}^{H(y)}(v_i)$ give an estimate of $\beta(y)$. Notice that the $c$-contact property
yields that

\[ L_{u_i}^{H(y)}(v_i) = \beta(v_i) - c(u_i, v_i) = -c(u_i, v_i). \]

Thus, there exists a constant $N = N(y)$ such that if $\beta$ is $c$-admissible for $H(y)$ and if $\beta(v_i) = 0$ for
all $i$, then $-N \le \beta(y) \le N$. We set $K_y = [-N, N]$.

To get the claim in Lemma 7.3, we let $H$ be any finite set and denote $Y_H = \{y_1, \dots, y_s\}$. Let

\[ H^* = H \cup H(y_1) \cup \dots \cup H(y_s). \]

Now choose $\beta$ to be $c$-admissible for $H^*$ with $\beta(v_i) = 0$ for all $i$. Since $\beta$ is also $c$-admissible for
$H(y_j)$, we have $\beta(y_j) \in K_{y_j}$ for all $j = 1, \dots, s$. Finally, note that $\beta$ is also $c$-admissible for $H$,
concluding the proof. □


Proof of Theorem 7.2. As before, let $G := \Gamma \cap (C(x_0) \times \mathbb{R}^d)$ and let $V := V(Y_G)$ be the ambient
space. We first find the desired function $\beta : Y_G \to \mathbb{R}$ by a compactness argument.
Indeed, define $K := \prod_{y \in Y_G} K_y$, where the $K_y$'s were obtained in Lemma 7.3. This is a subset
of the space of all functions from $Y_G$ to $\mathbb{R}$. In the topology of pointwise convergence, $K$ is compact
by Tychonoff's theorem. Now we claim that, for any finite $H \subset G$, the set

\[ \Psi_H := \{\beta \in K : \beta \text{ is } c\text{-admissible for } H\} \]

is a non-empty closed subset of $K$. Indeed, that $\Psi_H$ is non-empty follows from Lemma 7.3, since
every finite subset of $\Gamma$, and hence of $G$, is a $c$-contact layer: if necessary, one can extend the $\beta$ found
in Lemma 7.3 (and originally defined on $Y_H$) to $Y_G$, by simply letting $\beta(y) = 0$ for $y \notin Y_H$.
To show that $\Psi_H$ is closed, let $\{\beta_n\}$ be a sequence of $c$-admissible functions for $H$, and suppose
$\beta_n \to \beta$ pointwise on $Y_G$. We need to show that $\beta$ is also $c$-admissible for $H$. But for each $n$, we have
functions $(\alpha_n, \gamma_n)$ such that the following relations hold:

\begin{align}
\beta_n(y) - c(x,y) &\ge \gamma_n(x) \cdot (y - x) + \alpha_n(x) \quad \forall x \in X_H,\ y \in Y_H, \tag{7.12} \\
\beta_n(y) - c(x,y) &= \gamma_n(x) \cdot (y - x) + \alpha_n(x) \quad \forall (x,y) \in H. \tag{7.13}
\end{align}

Here, without loss of generality, we can assume that each vector $\gamma_n(x)$ is parallel to $V(Y_H)$. Now
since $(\beta_n(y) - c(x,y))_{x \in X_H,\, y \in Y_H}$ is uniformly bounded in $n$, we can choose $(\alpha_n(x), \gamma_n(x))$ in such a way
that $(\alpha_n(x), \gamma_n(x))_n$ is also uniformly bounded in $n$. Since $X_H$ is finite, we can find a subsequence of
$(\alpha_n, \gamma_n)$ which converges to $(\alpha, \gamma)$ at every $x \in X_H$. Then $(\alpha, \gamma, \beta)$ is clearly a $c$-admissible triplet for
$H$, establishing the claim on $\Psi_H$.

It is clear that the class $\{\Psi_H\}$ satisfies the finite intersection property, that is, $\emptyset \neq \Psi_{H_1 \cup \dots \cup H_s} \subseteq
\bigcap_{i=1,\dots,s} \Psi_{H_i}$. By the compactness of $K$ and the closedness of the $\Psi_H$'s, we deduce that the set $\Psi_G :=
\bigcap_{H \subset G,\, |H| < \infty} \Psi_H$ is nonempty.

We now claim that any $\beta \in \Psi_G$ is $c$-admissible for $G$. Indeed, fix $x \in X_G$ and $\beta \in \Psi_G$. We must show
that there exists an affine function $L_x$ on $V = V(Y_G)$ such that the following holds:

\begin{align}
\beta(y) - c(x,y) &\ge L_x(y), \quad \forall y \in Y_G, \tag{7.14} \\
\beta(y) - c(x,y) &= L_x(y), \quad \forall y \in G_x. \tag{7.15}
\end{align}

Choose a finite set $H_x \subset G_x$ such that $V(H_x) = V(G_x)$. Observe that for any finite set $F$ containing
$H := \{x\} \times H_x$,

\[ L_x^H(y) = \beta(y) - c(x,y) = L_x^F(y) \ \ \forall y \in H_x, \quad \text{hence} \quad L_x^F(y) = L_x^H(y) \ \ \forall y \in V(G_x). \]

In particular, $L_x^F(x) = L_x^H(x)$ since $x \in IC(G_x) \subset V(G_x)$. Let us define $\alpha(x) = L_x^H(x)$.

Now we need to construct the last piece, which is $\gamma(x)$. For this, in addition to $H$, we also choose a
finite set $\{(v_i, w_i)\}_{i=1}^m \subset G$ such that $x \in IC(\{w_i\}_{i=1}^m)$ and $V(\{w_i\}_{i=1}^m) = V$, and define

\[ \bar{H} := H \cup \{(v_i, w_i)\}_{i=1}^m. \]

For any finite set $F \subset G$ with $\bar{H} \subset F$, define the set

\begin{align}
\gamma_F(x) := \{ v \in V : \ & \beta(y) - c(x,y) - \alpha(x) \ge v \cdot (y - x), \ \forall y \in Y_F, \tag{7.16} \\
& \beta(y) - c(x,y) - \alpha(x) = v \cdot (y - x), \ \forall y \in F_x \}. \tag{7.17}
\end{align}

The set $\gamma_F(x)$ is nonempty because $F$ is a $c$-contact layer for $\beta$ (recall that $\beta \in \Psi_F$ and that
$L_x^F(x) = \alpha(x)$). Now, since $x \in IC(\{w_i\}_{i=1}^m)$ and
$V(\{w_i\}_{i=1}^m) = V$, we deduce from (7.16) that $\gamma_F(x)$ is a closed and bounded set in $V$, hence compact.
Again since every subset of a $c$-contact layer is also a $c$-contact layer, the class $\{\gamma_F(x) : F \supseteq \bar{H}\}$ has
the finite intersection property. Hence, we can choose a

\[ \gamma(x) \in \bigcap_{F \supseteq \bar{H},\, |F| < \infty} \gamma_F(x). \]

Finally, we show that (7.14) and (7.15) hold for this choice of $(\alpha(x), \gamma(x))$. Indeed, let $(x', y') \in G$,
and let $F = \bar{H} \cup \{(x', y')\}$. By (7.16), we have $\beta(y') - c(x, y') \ge \alpha(x) + \gamma(x) \cdot (y' - x)$, so (7.14) holds.
Let $y \in G_x$ and let $F = \bar{H} \cup \{(x, y)\}$. By (7.17), we have $\beta(y) - c(x,y) = \alpha(x) + \gamma(x) \cdot (y - x)$, so (7.15)
holds. This completes the proof of Theorem 2.11. □


8. Structural results for general optimal martingale transport plans 

We start by proving Conjecture 2 in the case of a discrete target measure.

Theorem 8.1. Let $c(x,y) = |x - y|$, suppose $\mu \ll \mathcal{L}^d$ and that $\nu$ is discrete, i.e. $\nu$ is supported on a
countable set. Let $\pi \in MT(\mu, \nu)$ be an optimal solution; then for $\mu$-a.e. $x$, $\operatorname{supp} \pi_x$ consists of $d + 1$
points which are vertices of a polytope in $\mathbb{R}^d$, and therefore the optimal solution is unique.

Proof. Since the result holds true (for more general target measures) when $d = 1$, we shall assume
that $d \ge 2$. Let $S$ be the countable support of $\nu$ and let $J := \{E \subset S : |E| < \infty \ \&\ \dim V(E) \le d - 1\}$,
where $|E|$ is the cardinality of the set $E$. Consider $V_J := \bigcup_{E \in J} V(E)$. Since $\dim V(E) \le d - 1$ and $J$ is
countable, it follows that $\mathcal{L}^d(V_J) = 0$. Let $\Gamma$ be a martingale-monotone regular concentration set for
$\pi$ (as in Definition 5.2). Let $X := X_\Gamma \setminus V_J$, so that $\mu(X) = 1$. Now notice that if $x \in X$, then $\Gamma_x$ must
contain vertices of a polytope which has $x$ in its interior.

Let now $K := \{E \subset S : |E| = d + 2$ and $E$ contains vertices of a $d$-dimensional polytope$\}$. Fix $F =
\{y_0, y_1, \dots, y_d, y\}$ in $K$, where $y_0, y_1, \dots, y_d$ are vertices of a $d$-dimensional polytope, and consider the
set $A := \{x \in X : F \subset \Gamma_x\}$. In other words, $A = \Gamma^{y_0} \cap \dots \cap \Gamma^{y_d} \cap \Gamma^{y}$, where $\Gamma^{y} := \{x : (x,y) \in \Gamma\}$. We
shall prove that $\mu(A) = 0$.

Indeed, suppose otherwise, that is, $\mu(A) > 0$, and let $x_0$ be a Lebesgue point of $A$. Let $B = A \cap C(x_0)$
and note that $\mathcal{L}^d(B) > 0$ since $C(x_0)$ is open in $\mathbb{R}^d$. Since the set $\Gamma \cap (C(x_0) \times \mathbb{R}^d)$ is a $c$-contact layer,
there exist constants $\lambda_0, \lambda_1, \dots, \lambda_d, \lambda$ such that for all $x \in B$, we have

\begin{align*}
|x - y_i| + \gamma(x) \cdot (y_i - x) + \alpha(x) &= \lambda_i, \quad i = 0, 1, \dots, d, \\
|x - y| + \gamma(x) \cdot (y - x) + \alpha(x) &= \lambda.
\end{align*}

Also note that $\{y_0, y_1, \dots, y_d, y\} \subseteq \operatorname{Ext}(\operatorname{conv}(\Gamma_x))$ for almost all $x \in B$. Let $p_i$ be determined by
$y = \sum_{i=0}^d p_i y_i$ and $\sum_{i=0}^d p_i = 1$, and note that some $p_i$ may be negative. Then, by the above, we get that
the function

\[ g(x) := \sum_{i=0}^d p_i |x - y_i| - |x - y| \]

is constant on $B$, which has positive measure.
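To make the function $g$ concrete, here is a minimal sketch in dimension $d = 2$. The triangle vertices, the extra point and the evaluation points are hypothetical values chosen for illustration, and the helper names `barycentric` and `g` are not from the text. The sketch solves for the barycentric weights $p_i$ (one of which is negative in this example) and evaluates $g$ at two points, showing that $g$ takes different values there.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def barycentric(y: Point, v: List[Point]) -> List[float]:
    """Weights p with y = sum p_i v_i and sum p_i = 1, for a non-degenerate triangle v in R^2."""
    (x0, y0), (x1, y1), (x2, y2) = v
    det = (x1 - x0) * (y2 - y0) - (x2 - x0) * (y1 - y0)
    # Cramer's rule for y - v0 = p1 (v1 - v0) + p2 (v2 - v0).
    p1 = ((y[0] - x0) * (y2 - y0) - (x2 - x0) * (y[1] - y0)) / det
    p2 = ((x1 - x0) * (y[1] - y0) - (y[0] - x0) * (y1 - y0)) / det
    return [1.0 - p1 - p2, p1, p2]


def g(x: Point, v: List[Point], y: Point) -> float:
    """g(x) = sum_i p_i |x - v_i| - |x - y| with p the barycentric weights of y."""
    p = barycentric(y, v)
    return sum(pi * math.dist(x, vi) for pi, vi in zip(p, v)) - math.dist(x, y)


vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # hypothetical y_0, y_1, y_2
extra = (1.0, 1.0)                               # hypothetical y
weights = barycentric(extra, vertices)
print(weights)
print(g((0.25, 0.25), vertices, extra), g((0.5, 0.2), vertices, extra))
```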

We explain why this leads to a contradiction. First, notice that because $g$ is real analytic in $\Omega :=
\mathbb{R}^d \setminus \{y_0, \dots, y_d, y\}$, it is not constant in any open subset, since otherwise it is constant everywhere,
which is not the case. Second, without loss of generality, assume $x_0 = 0$ and $g(0) = 0$, and notice
that from the real analyticity of $g$, one can write $g(x) = P_k(x) + Q(x)$ for some $k \in \mathbb{N}$, where $P_k(x)$
is the first nonzero $k$-th degree homogeneous polynomial, and $Q(x)$ is a power series of terms with
degree greater than $k$; in particular, $Q(x) = O(|x|^{k+1})$. Now, consider the set

\[ S := \{u \in S^{d-1} \mid \text{there exists } 0 \neq x_n \to 0,\ x_n / |x_n| \to u, \text{ with } 0 = g(x_n)\}. \]

Then, for each $u \in S$, $0 = \frac{g(x_n)}{|x_n|^k} = P_k(x_n / |x_n|) + \frac{Q(x_n)}{|x_n|^k}$, showing $P_k(u) = \lim_{n \to \infty} P_k(x_n / |x_n|) = 0$. Thus,
$S$ is a subset of the zero set $\{u : P_k(u) = 0\}$.

Now if $g$ is zero on the set $B$ where $x_0$ is a Lebesgue point, then $S = S^{d-1}$, hence $P_k = 0$, a
contradiction. Hence, $\mu(A) = 0$. The countability of $K$ now implies the theorem.

For the uniqueness, we use the usual argument, namely that the average of two optimal plans is also
optimal, which contradicts the polytope-type of their respective supports. □


Remark 8.2. As we see from the above proof, Theorem 8.1 holds true for a much more general
cost $c(x,y)$ than $|x - y|$. Indeed, it is enough (but not necessary) for $c(x,y)$ to be analytic in $\{x \ne y\}$,
and for the function $g(x) = \sum_{i=0}^d p_i\, c(x, y_i) - c(x,y)$ to be non-constant. In particular, we can choose
$c(x,y) = |x - y|^p$, with $p \ne 2$.
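
The role of the exclusion $p \ne 2$ can be seen on a small example. The sketch below uses hypothetical data (the triangle `ys`, the weights, and the sample points are assumptions made for illustration): for $p = 2$ the function $g$ collapses to the constant $\sum_i p_i |y_i|^2 - |y|^2$, so the non-constancy requirement fails, whereas for $p = 1$ it visibly varies.

```python
import math

def g(x, ys, weights, y, p):
    """g(x) = sum_i w_i |x - y_i|^p - |x - y|^p for the cost c(x, y) = |x - y|^p."""
    return sum(w * math.dist(x, yi) ** p for w, yi in zip(weights, ys)) - math.dist(x, y) ** p

ys = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]   # hypothetical vertices y_0, y_1, y_2
weights = [-0.5, 0.75, 0.75]                 # affine weights summing to 1
y = (0.75, 0.75)                             # y = sum_i w_i y_i
sample_points = [(0.3, 0.2), (-1.0, 0.5), (2.0, -1.5), (0.1, 3.0)]

def spread(p):
    """Oscillation of g over the sample points; zero means g looks constant there."""
    values = [g(x, ys, weights, y, p) for x in sample_points]
    return max(values) - min(values)

spread_p2 = spread(2)   # quadratic cost: g is constant, the argument gives nothing
spread_p1 = spread(1)   # distance cost: g is not constant
```

For the quadratic cost the common value is $0.375 = \sum_i p_i|y_i|^2 - |y|^2$ for these data.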


We now establish Conjecture 1 in the 2-dimensional case.


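Theorem 8.3. Assume $d = 2$, $c(x,y) = |x - y|$, $\mu$ is absolutely continuous with respect to the Lebesgue
measure, and $\nu$ has compact support. Let $\pi \in MT(\mu,\nu)$ be a solution of the optimization problem; then for $\mu$-almost every
$x \in \mathbb{R}^2$, $\operatorname{supp}\pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))$.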
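
For a kernel with finite support in the plane, the condition $\operatorname{supp}\pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))$ can be tested directly. The sketch below is an illustration only: the helper names and the sample supports are hypothetical, and it uses a standard monotone-chain convex hull (not a construction from the paper) that discards collinear points, so that what remains is exactly the set of extreme points.

```python
def cross(o, a, b):
    """Signed area of the parallelogram spanned by (a - o) and (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def extreme_points(points):
    """Extreme points of the convex hull of a finite planar set (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return set(pts)

    def half_hull(seq):
        hull = []
        for q in seq:
            # "<= 0" also pops collinear points, which are never extreme.
            while len(hull) >= 2 and cross(hull[-2], hull[-1], q) <= 0:
                hull.pop()
            hull.append(q)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    return set(lower[:-1] + upper[:-1])

def is_extremal_support(points):
    """True when every support point is an extreme point of the convex hull."""
    return set(points) == extreme_points(points)

triangle = [(0, 0), (2, 0), (0, 2)]
with_edge_midpoint = triangle + [(1, 0)]        # (1, 0) lies on an edge
with_interior_point = triangle + [(0.5, 0.5)]   # (0.5, 0.5) lies inside
```

Only the first support has the form asserted by the theorem; the other two contain a point that is a nontrivial convex combination of the remaining ones.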



Proof. We show that the set

$$E_\pi := \{x \in X \mid \operatorname{supp}\pi_x \subset \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))\}$$

has full $\mu$-measure. First note that $E_\pi$ is measurable by Proposition D.1. (Here, we used the fact
that each $\operatorname{supp}\pi_x \subset \mathbb{R}^d$ is compact, which is satisfied since the second marginal of $\pi$ is compactly
supported.)

We shall show that its complement $N = X \setminus E_\pi$ has $\mu$-measure zero, and since $\mu \ll \mathcal{L}^2$ it suffices to
show that $\mathcal{L}^2(N) = 0$. For that note first that the set $X_0 := \{x \in X : \dim(\operatorname{conv}(\Gamma_x)) = 0\}$ is obviously
included in $E_\pi$, which means that $N = (N \cap X_2) \cup (N \cap X_1)$, where

$$X_2 := \{x \in X : \dim V(C(x)) = 2\} \quad\text{and}\quad X_1 := \{x \in X : \dim V(C(x)) = 1\},$$

where $\{C(x);\ x \in X\}$ is the irreducible convex paving of $\Gamma$.

Note that $X_2 = \bigcup_{x \in X_2} (X \cap C(x)) = X \cap \big(\bigcup_{x \in X_2} C(x)\big)$. Since $\bigcup_{x \in X_2} C(x)$ is open, $X_2$ is measurable. But
since $\Gamma \cap (C(x) \times \mathbb{R}^2)$ is a $c$-contact layer, Theorem 2.3 yields that $\Gamma_x = \operatorname{Ext}(\operatorname{conv}(\Gamma_x))$ for a.e. $x$ in
$X_2 \cap C(x)$. Since $X_2$ can be approximated by compact sets from the inside and $\{C(x)\}_{x \in X_2}$ is an open
cover of $X_2$, we conclude that $\Gamma_x = \operatorname{Ext}(\operatorname{conv}(\Gamma_x))$ for a.e. $x$ in $X_2$. Hence, $\mathcal{L}^2(N \cap X_2) = 0$.
Consider now the measurable set $A_1 := N \cap X_1$, and assume that $\mathcal{L}^2(A_1) > 0$. Note that for every
$x \in A_1$, we have that $I(x) := IC(\operatorname{supp}\pi_x)$ is an open line segment with $x$ in its interior. Note that
$I(x) \subset C(x)$ and $C(x)$ is one-dimensional for every $x \in A_1$. By Proposition D.2 the function defined
for each $x \in A_1$ by

$$\delta(x) = \sup\{r : (x - r, x + r) \subset I(x)\}$$

is measurable, where $(x - r, x + r)$ denotes the interval of radius $r$ at $x$ inside the line segment $I(x)$.
Therefore, the set $A_\delta := \{x \in A_1 : \delta(x) > \delta\}$ is also measurable for every $\delta > 0$, and $\mathcal{L}^2(A_\delta) > 0$
for some $\delta > 0$. Let now $x_0$ be a Lebesgue point of $A_\delta$, and consider $W$ to be the 1-dimensional
affine space containing $x_0$ and perpendicular to $I(x_0)$. Choose $\varepsilon > 0$ much smaller than $\delta$ and let
$A_{\delta,\varepsilon} := A_\delta \cap B(x_0, \varepsilon)$ (note $\mathcal{L}^2(A_{\delta,\varepsilon}) > 0$). Then $\{C(x);\ x \in A_{\delta,\varepsilon}\}$ is a disjoint family of open segments



$\{F(C(x));\ x \in A_{\delta,\varepsilon}\}$ is a parallel cover of $F(A_{\delta,\varepsilon})$, so by Fubini's theorem with the bi-Lipschitz map $F$,
we conclude $\mathcal{L}^2(F(A_{\delta,\varepsilon})) = 0$, which is a contradiction. (Here, for Fubini's theorem, we used the
fact that $F(A_{\delta,\varepsilon})$ is measurable.) It follows that $\mathcal{L}^2(A_1) = 0$, which then yields $\mathcal{L}^2(N) = 0$. This
completes the proof. □

The same proof could extend to higher dimensions, provided one can prove measurability of the 
function 


$$X_\Gamma \ni x \mapsto \delta(x) = \sup\{r > 0 : B(x, r) \subset C(x)\}$$

defined for a given convex paving $(C(x))_{x \in X_\Gamma}$ associated to $\Gamma$. One can then obtain the following.

Theorem 8.4. Assume $c(x,y) = |x - y|$ on $\mathbb{R}^d \times \mathbb{R}^d$ and let $\pi \in MT(\mu,\nu)$ be a solution of the optimization problem with
a martingale-monotone regular concentration set $\Gamma$. Assume $\mu$ is absolutely continuous with respect
to the Lebesgue measure and that

the function $\delta$ is measurable, and $\dim(V(C(x))) \ge d - 1$ for $\mu$-a.e. $x$, (8.1)

where $(C(x))_{x \in X_\Gamma}$ is the irreducible convex paving associated to $\Gamma$. Then, for $\mu$-almost every $x \in \mathbb{R}^d$,
$\operatorname{supp}\pi_x = \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))$.


9. The disintegration of a martingale transport plan 


For a closed convex set $U \subset \mathbb{R}^d$, let $\mathcal{K}(U)$ be the space of all closed convex subsets in $\mathbb{R}^d$,
equipped with the Hausdorff metric in such a way that it becomes a separable complete metric
space (Polish space). This allows for the disintegration of a measure $\pi$ on $X$ via a measurable map
$T : X \to \mathcal{K}(U)$ (see e.g. [?, Corollary 2.4]) in such a way that each piece of the disintegrated
measure, say $\pi_C$, is a probability measure on $T^{-1}(C)$. In particular, $\pi_C(T^{-1}(C)) = 1$ for $T_\#\pi$-a.e.
$C \in \mathcal{K}(U)$, which ultimately yields conditional probabilities.

Consider now a set $\Gamma \in \mathcal{S}_{MT}$ and the corresponding unique irreducible convex paving $\{C;\ C \in \Phi\}$
as given in Theorem 2.8. Define the map

$$\Xi : \Gamma \to \mathcal{K}(\mathbb{R}^d) \quad\text{by}\quad (x,y) \mapsto C(x),$$

where $\mathcal{K}(\mathbb{R}^d)$ is the space of convex closed subsets of $\mathbb{R}^d$. We conjecture that this map is measurable
when $\mathcal{K}(\mathbb{R}^d)$ is equipped with the Hausdorff metric that makes it a separable complete metric space.
In this case, we shall show that a martingale transport plan $\pi$ can be canonically disintegrated into its
components given by $(\Gamma \cap (C(x) \times \mathbb{R}^d))_{x \in X_\Gamma}$. As usual, in the case of minimization with $c(x,y) = |x - y|$,
we shall assume further that $\mu \wedge \nu = 0$.


Theorem 9.1 (Disintegration of martingale plans). Let $(\mu, \nu)$ be probability measures on $\mathbb{R}^d$ in
convex order and let $\pi \in MT(\mu,\nu)$ with a concentration set $\Gamma \in \mathcal{S}_{MT}$ and the associated irreducible
convex paving $\{C;\ C \in \Phi\}$. Assume the map $\Xi : \Gamma \to \mathcal{K}(\mathbb{R}^d)$ defined by $(x,y) \mapsto C(x)$ is measurable,
let $\tilde\pi = \Xi_\#\pi$ denote the push-forward of $\pi$ into $\mathcal{K}(\mathbb{R}^d)$, and let $I \subset \mathcal{K}(\mathbb{R}^d)$ be the image of $\Gamma$ by $\Xi$.
Then the following holds:

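(1) There exists a disintegration of $\pi$ along the map $\Xi$ such that

$$\pi(S) = \int_{\mathcal{K}(\mathbb{R}^d)} \pi_C(S)\, d\tilde\pi(C) \quad\text{for each Borel set } S \subset \mathbb{R}^d \times \mathbb{R}^d, \tag{9.1}$$

where for $\tilde\pi$-a.e. $C$, $\pi_C$ is a probability measure supported on $\Gamma_C := \Gamma \cap (C \times \mathbb{R}^d)$.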

(2) For $\tilde\pi$-a.e. $C \in I$, there exist probability measures $\mu_C, \nu_C$ such that the couple $(\mu_C, \nu_C)$ is in
convex order, $\mu_C$ is supported on $X_C := X_\Gamma \cap C$, $\nu_C$ on $Y_{\Gamma_C}$, and $\pi_C \in MT(\mu_C, \nu_C)$.

(3) If $\pi$ is optimal for the optimization problem in $MT(\mu,\nu)$, then for $\tilde\pi$-a.e. $C \in I$, $\pi_C$ is optimal for the same
problem on $MT(\mu_C, \nu_C)$. Furthermore, $\Gamma_C$ is a $c$-contact layer. In particular, duality is attained
for $\pi_C$.

(4) If in addition, $\mu_C$ is absolutely continuous with respect to the Lebesgue measure on $V(C)$, and
$c(x,y) = |x - y|$, then for $\mu_C$-almost all $x$, $\Gamma_x = \operatorname{Ext}(\operatorname{conv}(\Gamma_x))$.

Proof. The above discussion and the measurability hypothesis on the map $\Xi : \Gamma \to \mathcal{K}(\mathbb{R}^d)$ defined
by $(x,y) \mapsto C(x)$ yield the disintegration of $\pi$ into $\pi_C\, d\tilde\pi(C)$ in (9.1), with $\pi_C$ supported on $\Gamma_C$. The
measures $\mu_C, \nu_C$ are obtained by taking marginals of $\pi_C$. The martingale and optimality properties
of $\pi_C$ for $\tilde\pi$-a.e. $C$ follow from those properties of $\pi$ and the disintegration (9.1). When $\pi$ is an
optimal martingale transport, the concentration set $\Gamma$ can be chosen in such a way that it is finitely
$c$-exposable, hence the set $\Gamma_C$ is a $c$-contact layer by Theorem 2.11. This deals with items (1), (2), and
(3) of the theorem. Finally, (4) follows immediately from Theorem 2.4. □


In order to apply this theorem and deduce global results from its local properties, one would like
to know when we can disintegrate $\mu$ into absolutely continuous pieces $\mu_C$, so as to apply Theorem 2.4
on each partition. We start with a counterexample showing that this is not possible in general, at least
in dimension $d \ge 3$.


Nikodym sets and martingale transports: Ambrosio, Kirchheim, and Pratelli [2] constructed a
Nikodym set in $\mathbb{R}^3$ having full measure in the unit cube, and intersecting each element of a family
of pairwise disjoint open lines only in one point. More precisely they showed the following.

Theorem 9.2. (Ambrosio, Kirchheim, and Pratelli [2]) There exist a Borel set $M_N \subset [-1,1]^3$ with
$|[-1,1]^3 \setminus M_N| = 0$ and a Borel map $f = (f_1, f_2) : M_N \to [-2,2]^2 \times [-2,2]^2$ such that the following
holds. If we define for $x \in M_N$ the open segment $l_x$ connecting $(f_1(x), -2)$ to $(f_2(x), 2)$, then

• $\{x\} = l_x \cap M_N$ for all $x \in M_N$,

• $l_x \cap l_y = \emptyset$ for all $x \ne y \in M_N$.


Example 9.3. One can use the above construction to construct an optimal martingale transport
whose equivalence classes are singletons, hence the disintegration of the first marginal along the
partitions $C(x)$ is the Dirac mass $\delta_x$, which is obviously not absolutely continuous with respect to the Lebesgue measure.
Consider the obvious inequality $\frac{1}{2\varepsilon}(|x - y| - \varepsilon)^2 \ge 0$, and its equivalent form

$$\frac{1}{2\varepsilon}|y|^2 \ \ge\ |x - y| + \frac{1}{\varepsilon}\, x \cdot (y - x) + \frac{1}{2\varepsilon}|x|^2 - \frac{\varepsilon}{2}. \tag{9.2}$$

Thus by letting $\alpha_\varepsilon(x) = \frac{1}{2\varepsilon}|x|^2 - \frac{\varepsilon}{2}$, $\beta_\varepsilon(y) = \frac{1}{2\varepsilon}|y|^2$ and $\gamma_\varepsilon(x) = \frac{1}{\varepsilon}x$, (9.2) yields that the set $\Gamma =
\{(x,y);\ |x - y| = \varepsilon\}$ is a $c$-contact layer, where $c(x,y) = |x - y|$ in the maximization problem. It follows
that every martingale $(X, Y)$ with $|X - Y| = \varepsilon$ a.s. is optimal with its own marginals $X \sim \mu$ and
$Y \sim \nu$.
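
Inequality (9.2), written with the constant term $-\varepsilon/2$ on the right-hand side (the normalization for which both sides agree exactly when $|x - y| = \varepsilon$), can be checked numerically. The sketch below is an illustration with assumed data: the value of `eps`, the sampling box, and the helper names are hypothetical. It verifies that the slack is non-negative at random pairs and vanishes on the layer $\{|x - y| = \varepsilon\}$.

```python
import math
import random

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

def slack(x, y, eps):
    """Left side minus right side of (9.2); equals (|x - y| - eps)^2 / (2 eps)."""
    left = dot(y, y) / (2 * eps)
    y_minus_x = [t - s for s, t in zip(x, y)]
    right = math.dist(x, y) + dot(x, y_minus_x) / eps + dot(x, x) / (2 * eps) - eps / 2
    return left - right

def random_unit_vector(dim):
    v = [random.gauss(0.0, 1.0) for _ in range(dim)]
    n = math.sqrt(dot(v, v))
    return [c / n for c in v]

random.seed(1)
eps = 0.1
min_slack = float("inf")
max_contact_slack = 0.0
for _ in range(2000):
    x = [random.uniform(-1, 1) for _ in range(3)]
    y = [random.uniform(-1, 1) for _ in range(3)]
    min_slack = min(min_slack, slack(x, y, eps))
    u = random_unit_vector(3)
    y_on_layer = [xi + eps * ui for xi, ui in zip(x, u)]   # a point with |x - y| = eps
    max_contact_slack = max(max_contact_slack, abs(slack(x, y_on_layer, eps)))
```

Up to rounding, `min_slack` is non-negative and `max_contact_slack` is zero, which is the contact-layer property used above.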

Now fix $\varepsilon > 0$ small and let $X$ be a random variable whose distribution $\mu$ has uniform density on
$[-1,1]^3$. We define $Y$ conditionally on $X$ by evenly distributing the mass along the lines $l_x$ considered
in Theorem 9.2 at distance $\varepsilon$; that is, $Y$ splits equally in two pieces from $x$ along $l_x$ at distance
$\varepsilon$. Then the martingale $(X, Y)$ is optimal for the maximization problem. But note that in this case,
each equivalence class $[x]$ is the singleton $\{x\}$, so the disintegration of $\mu$ along the partitions $C(x)$ is
the Dirac mass $\delta_x$, which is obviously not absolutely continuous with respect to the Lebesgue measure. Hence, the decomposition
is not useful in this case. One also notices that the convex sets associated to the irreducible paving
of the martingale $(X, Y)$ have codimension 2. We leave it as an open problem whether one can do
without assumption (8.1) in Theorem 8.4.


Remark 9.4. By letting $\varepsilon \to 0$, the above problem approaches the one considered in Example 5.7,
that is, the case when the marginals $\mu = \nu$ are equal, the only maximal martingale transport is the
identity, and the value of the maximal cost is zero. On the other hand, note that $\int \beta_\varepsilon(y)\,d\nu(y) -
\int \alpha_\varepsilon(x)\,d\mu(x) = \varepsilon$, which means that $(\alpha_\varepsilon, \beta_\varepsilon, \gamma_\varepsilon)$ is a minimizing sequence for the dual problem. But
none of the sequences $\alpha_\varepsilon, \beta_\varepsilon, \gamma_\varepsilon$ converges (neither pointwise nor in $L^1$). This is another manifestation
of the non-existence of a dual in Example 5.7. This said, for the minimization problem, we have
no example where duality is not attained.
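
The identity $\int \beta_\varepsilon\,d\nu - \int \alpha_\varepsilon\,d\mu = \varepsilon$ only uses $|X - Y| = \varepsilon$ and the martingale property, so it can be reproduced on a toy model. In the sketch below everything concrete is an assumption made for illustration: $\mu$ is replaced by an empirical measure on random points, the directions `directions` are arbitrary unit vectors rather than the Nikodym lines $l_x$, and $\alpha_\varepsilon(x) = \frac{1}{2\varepsilon}|x|^2 - \frac{\varepsilon}{2}$, $\beta_\varepsilon(y) = \frac{1}{2\varepsilon}|y|^2$.

```python
import math
import random

def sq_norm(a):
    return sum(c * c for c in a)

def alpha(x, eps):
    return sq_norm(x) / (2 * eps) - eps / 2

def beta(y, eps):
    return sq_norm(y) / (2 * eps)

def dual_value(points, directions, eps):
    """Integral of beta against nu minus integral of alpha against mu, for the kernel
    that sends x to x + eps*u and x - eps*u with probability 1/2 each."""
    total = 0.0
    for x, u in zip(points, directions):
        y_plus = [xi + eps * ui for xi, ui in zip(x, u)]
        y_minus = [xi - eps * ui for xi, ui in zip(x, u)]
        total += 0.5 * (beta(y_plus, eps) + beta(y_minus, eps)) - alpha(x, eps)
    return total / len(points)

random.seed(2)
points = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(500)]
directions = []
for _ in points:
    v = [random.gauss(0.0, 1.0) for _ in range(3)]
    n = math.sqrt(sq_norm(v))
    directions.append([c / n for c in v])

eps_values = (0.5, 0.1, 0.01)
gaps = {eps: dual_value(points, directions, eps) - eps for eps in eps_values}
alpha_at_fixed_point = [alpha([0.5, 0.5, 0.5], eps) for eps in eps_values]
```

The entries of `gaps` vanish up to rounding, while `alpha_at_fixed_point` grows without bound as $\varepsilon$ decreases, in line with the lack of convergence noted in the remark.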


Appendix A. A suitable concentration set for a martingale transport plan 
Here we prove the following lemma, which was introduced in Section [?].










Lemma A.1. Let $\pi \in MT(\mu,\nu)$ and let $A \subset \mathbb{R}^d \times \mathbb{R}^d$ be a Borel set with $\pi(A) = 1$. Then there exists
a Borel set $\Gamma \subset A$ with $\pi(\Gamma) = 1$ such that the map $x \mapsto \pi_x$ is measurable and defined everywhere on
$X_\Gamma$ in such a way that:

(1) $\overline{\Gamma_x} = \operatorname{supp}\pi_x$ for all $x \in X_\Gamma$,

(2) $\Gamma \in \mathcal{S}_{MT}$, that is, $x \in IC(\Gamma_x)$ for all $x \in X_\Gamma$,

(3) If we assume that $\mu \ll \mathcal{L}^d$, then $\Gamma$ can be chosen in such a way that $X_\Gamma \subset IC(Y_\Gamma)$.

(4) If in addition $\pi$ is a solution of the optimization problem, then $\Gamma$ can be chosen to be
finitely $c$-exposable.


Proof. Let $(\pi_x)_x$ be the unique disintegration of $\pi$ with respect to $\mu$. It is well known that this yields
a well-defined measurable map $x \mapsto \pi_x$ on a Borel set $E$ in $\mathbb{R}^d$ with $\mu(E) = 1$ such that each $x$ in $E$
is the barycenter of $\pi_x$ and $\pi_x(A_x) = 1$. It is clear that $x \in CC(A_x)$. However, it is not necessarily in
$IC(A_x)$. Note however that for any Borel set $B$ in $\mathbb{R}^d$, the map $x \mapsto \pi_x(B)$ is Borel measurable, hence
for each $r > 0$, the set $\mathcal{B}_r := \{(x,y) \mid x \in E,\ \pi_x(B_r(y)) > 0\}$ is Borel (here, $B_r(y)$ is the open ball with
center $y$ and radius $r$ in $\mathbb{R}^d$), and consequently the set $\Theta := \{(x,y) \mid x \in E,\ y \in \operatorname{supp}(\pi_x)\} = \bigcap_{n=1}^\infty \mathcal{B}_{1/n}$
is also Borel. Letting $\Gamma := A \cap \Theta$, it is clear that $\pi(\Gamma) = 1$ and $\pi_x(\Gamma_x) = 1$ for all $x \in E$. Finally note
that the probability measure $\pi_x$ has its barycenter at $x$ and that $\Gamma_x \subset \operatorname{supp}(\pi_x)$, and since $\pi_x(\Gamma_x) = 1$,
we have that $\overline{\Gamma_x} = \operatorname{supp}\pi_x$. Hence in particular, $x \in IC(\Gamma_x)$ for $x \in E$, proving (1) and (2).

Item (3) can be obtained by considering another subset of $\Gamma$. Indeed, let $X'$ be the set of Lebesgue
points of $X_\Gamma$. Then as $\mu \ll \mathcal{L}^d$, we have $\mu(X') = 1$. Let $\Gamma' := \Gamma \cap (X' \times \mathbb{R}^d)$. Then, $\Gamma' \in \mathcal{S}_{MT}$,
$\pi(\Gamma') = 1$ and $X' \subset IC(X') \subset IC(Y_{\Gamma'})$, as claimed.

For item (4), we use [?, ?], where it is shown that for an optimizer $\pi$, there exists $A$ with $\pi(A) = 1$
that is finitely $c$-exposable (also called finitely $c$-monotone in [?]); see Definition 2.9. We then
restrict $A$ to get $\Gamma$ which also satisfies (1), (2) and (3) by the above procedure. □


Appendix B. An estimate for convex functions 

We prove here a technical result, used in Section 6, that allows us to control the maximum of a
convex function by the integral of its second derivatives. Namely,

Proposition B.1. Let $B_r$ denote the closed ball of radius $r$ centred at the origin 0. Let $\varphi$ be a (smooth)
convex function such that $\varphi(0) = 0$ and $\varphi \ge 0$. Then,

$$\int_{B_{\frac{\sqrt{5}}{2} r}} \Delta\varphi \ \ge\ C_0\, r^{d-2} \max_{B_r} \varphi, \tag{B.1}$$

where the constant $C_0 > 0$ depends only on the dimension $d$.


Proof. Denote $M_r = \max_{B_r} \varphi$. By the maximum principle a point, say $p \in \partial B_r$, can be chosen from
the boundary so that $\varphi(p) = M_r$. Choose an orthonormal basis $\eta_1, \dots, \eta_d$ such that $p = r\eta_1$, and define
a cylindrical set (of radius $r/2$)

$$K_r := \Big\{ \sum_{i=1}^d t_i \eta_i \ \Big|\ |t_1| \le r,\ \sum_{i=2}^d t_i^2 \le r^2/4 \Big\}.$$

We will show that

$$\int_{K_r} D^2_{11}\varphi \ \ge\ C_0\, r^{d-2} \max_{B_r} \varphi \tag{B.2}$$

for a constant $C_0 > 0$ depending only on the dimension $d$. This will immediately imply the desired
estimate (B.1) because $K_r \subset B_{\frac{\sqrt{5}}{2} r}$ and $0 \le D^2_{11}\varphi \le \Delta\varphi$ for the convex function $\varphi$.

To show (B.2), we let $H$ denote the hyperplane $\{z_1 = 0\}$. Notice that $\varphi(0) = 0$ and $\varphi \le M_r$ on $B_r$,
thus from convexity of $\varphi$, we see that

$$\varphi(z) \ \le\ \frac{|z|}{r}\,(M_r - \varphi(0)) = \frac{|z|}{r}\, M_r \quad\text{for each } z \in B_r. \tag{B.3}$$

Also, notice the fact that $p = r\eta_1$ is a maximum point of $\varphi$ in $B_r$, and that the hyperplane $r\eta_1 + H$
stays outside the interior of $B_r$, i.e. $r\eta_1 + H \subset \mathbb{R}^d \setminus (\operatorname{int} B_r)$. So, from convexity of $\varphi$, we have

$$\varphi(z + r\eta_1) \ \ge\ M_r \quad\text{for each } z \in H.$$

Combining this with (B.3) and using convexity of $\varphi$ again, we can estimate the derivative $D_1\varphi$ on
the set $r\eta_1 + (H \cap B_r)$. Namely, for each $z \in H \cap B_r$,

$$D_1\varphi(z + r\eta_1) \ \ge\ \frac{1}{r}\big(\varphi(z + r\eta_1) - \varphi(z)\big) \ \ge\ \frac{1}{r}\Big(M_r - \frac{|z|}{r} M_r\Big) = \frac{1}{r^2}(r - |z|)\, M_r.$$

Similarly, use (B.3), the fact that $\varphi \ge 0$, in particular on $-r\eta_1 + H$, and the convexity of $\varphi$ to see that

$$D_1\varphi(z - r\eta_1) \ \le\ \frac{1}{r}\big(\varphi(z) - \varphi(z - r\eta_1)\big) \ \le\ \frac{|z|}{r^2}\, M_r \quad\text{for each } z \in H \cap B_r.$$

From these estimates on $D_1\varphi$, we have that for each $z \in H \cap B_r$,

$$\int_{-r}^{r} D^2_{11}\varphi(z + t\eta_1)\,dt = D_1\varphi(z + r\eta_1) - D_1\varphi(z - r\eta_1) \ \ge\ \frac{1}{r^2}(r - |z|)\, M_r - \frac{|z|}{r^2}\, M_r = \frac{1}{r^2}(r - 2|z|)\, M_r.$$

Now,

$$\int_{K_r} D^2_{11}\varphi\,dz = \int_{z \in H \cap B_{r/2}} \int_{-r}^{r} D^2_{11}\varphi(z + t\eta_1)\,dt\,dz \ \ge\ \int_{z \in H \cap B_{r/2}} \frac{1}{r^2}(r - 2|z|)\, M_r\,dz = C_0\, r^{d-2} M_r,$$

where

$$C_0 = r^{2-d} \int_{H \cap B_{r/2}} \frac{1}{r^2}(r - 2|z|)\,dz = \int_{H \cap B_{1/2}} (1 - 2|z|)\,dz$$

is independent of $r$. Notice that $C_0 > 0$ because $|z|$ varies from 0 to $1/2$ on $H \cap B_{1/2}$. This completes
the proof. □
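
The constant $C_0 = \int_{H \cap B_{1/2}} (1 - 2|z|)\,dz$ can be evaluated in polar coordinates in the hyperplane $H \cong \mathbb{R}^{d-1}$, which gives $C_0 = \omega_{d-1}\, 2^{1-d}/d$ with $\omega_{d-1}$ the volume of the unit ball of $\mathbb{R}^{d-1}$. The sketch below is an illustration only: the closed form is our own computation rather than a formula quoted from the paper, and the function names are hypothetical. It compares the closed form with a midpoint-rule quadrature and evaluates both sides of the cylinder estimate (B.2) for the test function $\varphi(x) = |x|^2$.

```python
import math

def unit_ball_volume(n):
    """Lebesgue measure of the unit ball in R^n."""
    return math.pi ** (n / 2) / math.gamma(n / 2 + 1)

def c0_closed_form(d):
    """C_0 = integral of (1 - 2|z|) over the ball of radius 1/2 in R^(d-1)."""
    return unit_ball_volume(d - 1) * 2.0 ** (1 - d) / d

def c0_radial_quadrature(d, steps=20000):
    """The same integral in polar coordinates, by the midpoint rule (d >= 2)."""
    h = 0.5 / steps
    sphere_area = (d - 1) * unit_ball_volume(d - 1)   # area of the unit sphere in R^(d-1)
    total = 0.0
    for i in range(steps):
        rho = (i + 0.5) * h
        total += (1 - 2 * rho) * rho ** (d - 2) * h
    return sphere_area * total

def ratio_for_squared_norm(d, r):
    """(integral over K_r of D_11^2 phi) / (C_0 r^(d-2) max_{B_r} phi) for phi(x) = |x|^2."""
    cylinder_volume = 2 * r * unit_ball_volume(d - 1) * (r / 2) ** (d - 1)
    lhs = 2 * cylinder_volume                 # D_11^2 phi = 2 everywhere
    rhs = c0_closed_form(d) * r ** (d - 2) * r ** 2
    return lhs / rhs
```

For $\varphi(x) = |x|^2$ the ratio equals $4d$ for every $r$, so (B.2) holds with room to spare in this case, and the independence of $r$ reflects the scaling built into $C_0$.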


Appendix C. A bi-Lipschitz flattening map 

The following lemma, which describes a bi-Lipschitz "flattening map", was used in Section 8.

Lemma C.1. Let $\mathbb{R}^d = V \times W$, where $V = \mathbb{R}^{d-1}$ and $W = \mathbb{R}$. Let $\delta > 0$ and let $A$ be a subset
of $W$. Suppose that for each $h \in A$, there is a set $D_h$ which is contained in a hyperplane $H_h$ with
$H_h \cap W = \{(0, \dots, 0, h)\}$. Suppose further that $\{D_h\}_{h \in A}$ are mutually disjoint and the projection of
every $D_h$ on $V$ contains the ball $B_R$ with center 0 and radius $R$ in $V$. Finally, suppose that the
angle between $H_h$ and $W$ is bounded: there is $\eta < \pi/2$ such that the normal direction of $H_h$ and the
direction of $W$ have angle less than $\eta$ for every $h \in A$.

Now define the flattening map $F : \bigcup_h D_h \to \mathbb{R}^d$ as follows: for $x = (v, w) \in D_h$, $F(v, w) = (v, h)$.
Then $F$ is bi-Lipschitz on the set $N := (\bigcup_h D_h) \cap (B_r \times W)$, where $r < R$.

Proof. First, note that by the disjointness of $\{D_h\}$ the map $F$ is bijective, so $F^{-1}$ is well-defined.
The lemma is intuitively clear; the map $F$ cannot move two nearby points too far away, because the
hyperdiscs $\{D_h\}$ are disjoint.

First of all, from the bounded angle assumption, $F$ is clearly bi-Lipschitz on each $D_h$ with the
same Lipschitz constant for all $h \in A$. Hence, for $x_1 = (v_1, w_1)$, $x_2 = (v_2, w_2)$, we will assume that
$x_1, x_2$ are contained in $D_{h_1}, D_{h_2}$ respectively, and $h_1 \ne h_2$.

We consider the case $v_1 = v_2 \in V$ and $|v_1| = |v_2| < r$. Let $L$ be the 1-dimensional subspace of $V$
containing 0 and $v_1$. Regarding $D_{h_1}, D_{h_2}$ as affine functions on $V$, since their graphs on $L \cap B_R$ are
disjoint and linear and $r < R$, it is clear that $|w_1 - w_2| \approx |h_1 - h_2|$; i.e.

$$C_1 |h_1 - h_2| \le |w_1 - w_2| \le C_2 |h_1 - h_2| \quad\text{for some } C_1, C_2 > 0. \tag{C.1}$$

Next we consider the case $v_1 \ne v_2$. We want to show $|x_1 - x_2| \approx |F(x_1) - F(x_2)|$, or equivalently,

$$|w_1 - w_2| \approx |h_1 - h_2|.$$

Let $L$ be the line in $V$ containing $v_1$ and $v_2$. Regarding $D_{h_1}, D_{h_2}$ as affine functions
on $V$, since their graphs on $L \cap B_R$ are disjoint and linear, it is clear that

$$|w_1 - w_2| = |D_{h_1}(v_1) - D_{h_2}(v_2)| \le \max\big(|D_{h_1}(v_1) - D_{h_2}(v_1)|,\ |D_{h_1}(v_2) - D_{h_2}(v_2)|\big).$$

But by (C.1), we have

$$\max\big(|D_{h_1}(v_1) - D_{h_2}(v_1)|,\ |D_{h_1}(v_2) - D_{h_2}(v_2)|\big) \le C_2 |h_1 - h_2|,$$

which shows that $F^{-1}$ is Lipschitz on $F(N)$. On the other hand, by (C.1), we have

$$|h_1 - h_2| \le (1/C_1) \min\big(|D_{h_1}(v_1) - D_{h_2}(v_1)|,\ |D_{h_1}(v_2) - D_{h_2}(v_2)|\big)$$

and, again regarding $D_{h_1}, D_{h_2}$ as disjoint linear graphs on $L \cap B_R$, we have

$$\min\big(|D_{h_1}(v_1) - D_{h_2}(v_1)|,\ |D_{h_1}(v_2) - D_{h_2}(v_2)|\big) \le |D_{h_1}(v_1) - D_{h_2}(v_2)| = |w_1 - w_2|,$$

which shows that $F$ is Lipschitz on $N$, and the proof is complete. □
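
A two-dimensional toy family makes the flattening map concrete. The sketch below is an illustration under assumptions that are not in the paper: $V = W = \mathbb{R}$, the sets $D_h$ are the segments $\{(v, h(1 + av)) : |v| \le R\}$ with $h \in [-1, 1]$, and the hypothetical tilt parameter `A_COEFF` satisfies $|a|R < 1$, so that distinct segments never meet over $[-R, R]$ and each one crosses the $W$-axis at height $h$. For this family $F(v, w) = (v, w/(1 + av))$, and the code samples the distortion ratio $|F(x_1) - F(x_2)| / |x_1 - x_2|$ on the strip $|v| < r$.

```python
import math
import random

A_COEFF = 0.5    # hypothetical tilt parameter a, chosen with |a| * R < 1
R_BIG = 1.0      # every D_h projects onto the interval [-R, R]
R_SMALL = 0.9    # F is tested on the strip |v| < r, with r < R

def point_on_disc(v, h):
    """The point of D_h lying above v, where D_h is the graph of v -> h * (1 + a v)."""
    return (v, h * (1.0 + A_COEFF * v))

def flatten(x):
    """Flattening map F(v, w) = (v, h), where h labels the segment D_h through (v, w)."""
    v, w = x
    return (v, w / (1.0 + A_COEFF * v))

random.seed(3)
ratios = []
for _ in range(5000):
    x1 = point_on_disc(random.uniform(-R_SMALL, R_SMALL), random.uniform(-1, 1))
    x2 = point_on_disc(random.uniform(-R_SMALL, R_SMALL), random.uniform(-1, 1))
    separation = math.dist(x1, x2)
    if separation == 0.0:
        continue
    ratios.append(math.dist(flatten(x1), flatten(x2)) / separation)

min_ratio, max_ratio = min(ratios), max(ratios)
```

The sampled ratios stay between two positive constants, as the lemma predicts. Letting $r$ approach $1/|a|$ would bring the common intersection point of the lines into the strip, and the upper bound would then blow up, which is one way to see why disjointness over a larger ball $B_R$ is assumed.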

Appendix D. Proofs of measurability 

We now establish the following proposition, which was used in the proofs of Section 8.

Proposition D.1. Let $\pi$ be a Borel measure on the product space $\mathbb{R}^d \times \mathbb{R}^d$ and let $A \subset \mathbb{R}^d$ be a
concentration set for its first marginal. Let $x \mapsto \pi_x$ be the corresponding disintegration map from
$A$ to $P(\mathbb{R}^d)$ and assume that for each $x \in A$, the set $\operatorname{supp}\pi_x \subset \mathbb{R}^d$ is compact, which is satisfied in
particular if the second marginal of $\pi$ is compactly supported. Then, the set

$$E_\pi := \{x \in A \mid \operatorname{supp}\pi_x \subset \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))\}$$

is a Borel measurable set in $\mathbb{R}^d$.

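Proof. Let $N_\pi = A \setminus E_\pi$, that is,

$$N_\pi = \{x \in A \mid \operatorname{supp}\pi_x \not\subset \operatorname{Ext}(\operatorname{conv}(\operatorname{supp}\pi_x))\}.$$

We will show that there is a measurable set $N$ in $\mathbb{R}^d$ such that $N_\pi \subset N$ and $E_\pi \cap N = \emptyset$, which then
implies that the set $E_\pi = A \setminus N$ is measurable, as desired.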

We shall use a classical result of Carathéodory, which implies that a point $z \in \operatorname{supp}\pi_x$ is not an
extremal point of the convex hull of $\operatorname{supp}\pi_x$ if and only if it lies in the relative interior of an $r$-simplex
($1 \le r \le d$) with vertices in $\operatorname{supp}\pi_x$. First choose a countable dense subset $Q \subset \mathbb{R}^d$ and associate to
each $q \in Q$ an $(\varepsilon, \delta)$-admissible $r$-simplex $S \subset \mathbb{R}^d$, defined as follows:

(1) all the vertices of $S$ belong to $Q$,

(2) $q$ is $\varepsilon$-close to a (relative) interior point of $S$, and

(3) all vertices of $S$ are $\delta$-away from $q$.

Let $\mathcal{A}_{\varepsilon,\delta}(q)$ denote the countable set of all $(\varepsilon, \delta)$-admissible simplices for $q$. Now define the set

$$S_{\varepsilon,\delta}(q) := \{x \in A \mid \pi_x(B_\varepsilon(q)) > 0 \ \&\ \text{there exists } S \in \mathcal{A}_{\varepsilon,\delta}(q) \text{ such that for each vertex } q_j \text{ of } S,\ \pi_x(B_\varepsilon(q_j)) > 0\}.$$

This set $S_{\varepsilon,\delta}(q)$ contains all those points $x$ in $A$ such that $\operatorname{supp}\pi_x$ includes, up to an $\varepsilon$-error, both
the point $q$ and the vertices of an $(\varepsilon, \delta)$-admissible simplex for $q$. Since the map $x \mapsto \pi_x \in P(\mathbb{R}^d)$
is measurable, each set $S_{\varepsilon,\delta}(q)$ is measurable, since it can be written as the countable union of
measurable sets. Define $N_{\varepsilon,\delta} := \bigcup_{q \in Q} S_{\varepsilon,\delta}(q)$, and set

$$N := \bigcup_{k \in \mathbb{N}} \bigcap_{j \ge 1} N_{2^{-j-k},\, 2^{-k+1}}.$$

It is clear that $N$ is measurable. We now show that it has the desired properties.

Claim 1: $N_\pi \subset N$. Indeed, for any $x \in N_\pi$, there exists a $z \in \operatorname{supp}\pi_x$ lying in the relative interior of
an $r$-simplex, say $S$, with vertices in $\operatorname{supp}\pi_x$. Let $\delta_0 > 0$ be a lower bound for the distances from $z$
to the vertices of $S$ as well as the distances between any two vertices. Fix $k \in \mathbb{N}$ large enough so that
$\delta_0 > \delta = 2^{-k+1}$. Since $Q$ is dense, one can find for each $\varepsilon = 2^{-j-k}$, $j \ge 1$, a point $q \in B_\varepsilon(z)$ and an
$(\varepsilon, \delta)$-admissible simplex $S_j$ for $q$ whose vertices are $\varepsilon$-close to the vertices of $S$. This implies that
for each $j \in \mathbb{N}$, $x \in S_{\varepsilon,\delta}(q)$ where $\varepsilon = 2^{-j-k}$ and $\delta = 2^{-k+1}$. This shows that $x \in \bigcap_{j \ge 1} N_{2^{-j-k},\, 2^{-k+1}} \subset N$
as desired.

Claim 2: $E_\pi \cap N = \emptyset$. Indeed, suppose not; then there exists $x \in E_\pi \cap \bigcap_{j \ge 1} N_{2^{-j-k},\, 2^{-k+1}}$ for some $k \in \mathbb{N}$.
Let $\delta = 2^{-k+1}$ and $\varepsilon_j = 2^{-j-k}$ for each $j \ge 1$. Then, we see that for each $j \ge 1$, there exist $q_j \in Q$
and a simplex, say $S_j$, that is $(\varepsilon_j, \delta)$-admissible for $q_j$ such that $q_j$ and the vertices of $S_j$ are $\varepsilon_j$-close
to $\operatorname{supp}\pi_x$. Since $\operatorname{supp}\pi_x$ is compact by assumption, there exists a convergent subsequence of $\{q_j\}$,
as well as a convergent subsequence of the simplices $\{S_j\}$ (in the Hausdorff topology, since their
vertices converge). Let $q_\infty, S_\infty$ denote their limits (as $j \to \infty$), respectively. Note that $q_\infty \in \operatorname{supp}\pi_x$
and that $S_\infty$ is a simplex with vertices in $\operatorname{supp}\pi_x$. By the definition of $(\varepsilon_j, \delta)$-admissibility, we also
have that $q_\infty$ belongs to the closure of $S_\infty$, while being $\delta$-away from its vertices. This implies that
the point $q_\infty \in \operatorname{supp}\pi_x$ is not an extremal point of the convex hull of $\operatorname{supp}\pi_x$. This contradicts the
fact that $x \in E_\pi$, thus completing the proof of Claim 2 and the proposition. □

Next, we show the following. 

Proposition D.2. Let $\pi$ be a Borel measure on the product space $\mathbb{R}^d \times \mathbb{R}^d$ and let $x \mapsto \pi_x \in P(\mathbb{R}^d)$
be its disintegration along the first marginal. Let $A \subset \mathbb{R}^d$ be a Borel measurable set that is a
concentration set for the first marginal of $\pi$, and denote for each $x \in A$ the set $I(x) = IC(\operatorname{supp}\pi_x)$.

Assume that $I(x)$ is bounded and that $x \in I(x)$ for each $x \in A$. If $A_1 \subset A$ is a measurable set such
that for each $x \in A_1$, $\dim I(x) = 1$, then the function $w : A_1 \to \mathbb{R}_+$ defined by

$$w(x) := \min[\operatorname{dist}(x, y_0),\ \operatorname{dist}(x, y_1)],$$

where $y_0, y_1$ are the end points of the segment $I(x)$, is Borel measurable.

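Proof. It is enough to show that for each $\delta > 0$, the set $M_\delta = \{x \in A_1 \mid w(x) \ge \delta\}$ is Borel measurable.
For that, we again consider a countable dense subset $Q \subset \mathbb{R}^d$. For $\varepsilon, \delta > 0$, we say that a closed
segment $S = [p_0, p_1]$ connecting two points $p_0, p_1 \in \mathbb{R}^d$ is $(\varepsilon, \delta)$-admissible for $q \in Q$ if

(1) $p_0, p_1 \in Q$;

(2) $q \in N_\varepsilon(S)$, the latter being the $\varepsilon$-tubular neighborhood of $S$;

(3) $\operatorname{dist}(p_i, q) \ge \delta$, for $i = 0, 1$.

Let $\mathcal{A}_{\varepsilon,\delta}(q)$ denote the countable set of $(\varepsilon, \delta)$-admissible segments for $q$, and define the set

$$S_{\varepsilon,\delta}(q) := \{x \in A_1 \mid \operatorname{dist}(x, q) < \varepsilon \ \&\ \text{there exists } [p_0, p_1] \in \mathcal{A}_{\varepsilon,\delta}(q) \text{ with } \pi_x(B_\varepsilon(p_i)) > 0,\ i = 0, 1\}.$$

The set $S_{\varepsilon,\delta}(q)$ contains those points $x$ in $A_1$ such that $x$ is $\varepsilon$-close to $q$, while $\operatorname{supp}\pi_x$ includes, up to
$\varepsilon$, the end points of an $(\varepsilon, \delta)$-admissible segment for $q$. Again, each set $S_{\varepsilon,\delta}(q)$ is measurable, since
the map $x \mapsto \pi_x \in P(\mathbb{R}^d)$ is measurable. Define the set $M_{\varepsilon,\delta} := \bigcup_{q \in Q} S_{\varepsilon,\delta}(q)$, and set

$$\tilde M_\delta := \bigcap_{i \ge 1} M_{2^{-i},\, \delta}.$$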

It is obvious that $\tilde M_\delta$ is measurable. We claim that

$$M_\delta = \tilde M_\delta. \tag{D.1}$$

Indeed, we first verify that $\tilde M_\delta \subset M_\delta$. To see this, consider an arbitrary point $x \in \tilde M_\delta$, and let
$y_0, y_1$ be the two end points of the segment $I(x)$. Then for each $0 < \varepsilon < \delta$, there is $q \in Q$ and
$S = [p_0, p_1] \in \mathcal{A}_{\varepsilon/2,\delta}(q)$ such that $x \in B_{\varepsilon/2}(q)$ and $\pi_x(B_{\varepsilon/2}(p_i)) > 0$ for $i = 0, 1$. From the last
condition, we see that $p_0, p_1 \in N_{\varepsilon/2}(\operatorname{supp}\pi_x)$ and hence $S \subset N_{\varepsilon/2}(I(x))$. Moreover, from item (3)
for the $(\varepsilon/2, \delta)$-admissibility of $S$ together with $x \in B_{\varepsilon/2}(q)$, we see that $\operatorname{dist}(p_i, x) \ge \delta - \varepsilon/2$, which
then implies that $\operatorname{dist}(x, y_i) \ge \delta - \varepsilon$. Since $\varepsilon > 0$ was arbitrary, this implies that $x \in M_\delta$ as desired.
For the reverse inclusion $M_\delta \subset \tilde M_\delta$, note that for each $x \in M_\delta$, we have $\operatorname{dist}(y_i, x) \ge \delta$, $i = 0, 1$, where
$y_0, y_1$ are the end points of the segment $I(x)$. Also, notice that $y_i \in \operatorname{supp}\pi_x$, $i = 0, 1$. Since $Q \subset \mathbb{R}^d$
is dense, one can find for each $0 < \varepsilon < \delta$ a point $q \in Q$ and a segment $S = [p_0, p_1] \in \mathcal{A}_{\varepsilon,\delta}(q)$ such
that $q \in B_\varepsilon(x)$ and $p_i \in B_\varepsilon(y_i)$, $i = 0, 1$. It follows that $x \in S_{\varepsilon,\delta}(q)$, which implies $x \in M_{\varepsilon,\delta}$ for all
$0 < \varepsilon < \delta$, thus $x \in \tilde M_\delta$. This completes the proof. □